RE: [squid-users] How to not cache a site?

2008-06-11 Thread Jerome Yanga
It took me some time to find this email; a friend finally pointed it out.
You brought up the point that we have been trying to solve by not
caching the site.

Here is the full story.  There is a site behind our Reverse Proxy that
keeps misbehaving: icons go missing and some pages do not follow the
formatting.  Looking into this issue, we realized that even though we
configured our HTTP headers not to cache this site, I still find
instances of that site in the cache during purging.  This is what
started this post.

We have compiled our Apache to use mod_auth_session to assist in
security of the site.

What you have provided below seems to have pointed us to resolving this
issue.  

The redirection just has a Cache-Control: max-age=0, which allows the
cache to store the response, and just requires that it be revalidated 
(which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).

It seems that when the responses got stored, the authentication gets
confused and, in effect, some objects referenced in the page cannot be
accessed.  We will modify our mod_auth_session code to avoid this
condition.  If neither recoding mod_auth_session nor adding the
cache deny directives works, I will update this post.

Thank you so much, Chris.

Regards,
Jerome



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 10, 2008 1:12 PM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] How to not cache a site?

Jerome Yanga wrote:
 Resending as I had received a failure notice message.

 I do not think that the refresh_pattern is even setup as they are all
 commented out.

 # grep refresh_pattern /etc/squid/squid.conf
 # refresh_pattern regex min percent max
 #refresh_pattern -i \.js$   0   0%  1
 #refresh_pattern -i \.css$  0   10% 30
 #refresh_pattern .  0   20% 4320

 Attached is a zipped http header log captured using Live HTTP Headers.

 Regards,
 Jerome
   

Sample squid log entry from the zip file (without cookies) for
reference:


TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - 
[06/Jun/2008:21:42:52 +] GET 
http://site_address.com/help/chr_ind_on.gif HTTP/1.1 302 830 
http://site_address.com/help/whskin_tbars.htm; Mozilla/5.0 (Windows; 
U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14


There were no associated HTTP headers for this object 
(http://site_address.com/help/chr_ind_on.gif)*, but here is another 
request that also resulted in a 302 (Moved Temporarily):


GET /help/chr_back.gif HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
Gecko/20080404 Firefox/2.0.0.14
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://site_address.com/help/whskin_tbars.htm
Cookie: [removed]

HTTP/1.x 302 Moved Temporarily
Date: Thu, 05 Jun 2008 23:40:54 GMT
Location: http://site_address.com/gateway/index.cfm?fa=loginreturnURL=http%3A%2F%2Fsite_address%2Ecom%2Fhelp%2FFchr%5Fback%2Egif
Cache-Control: max-age=0
Expires: Thu, 05 Jun 2008 23:40:54 GMT
Content-Length: 422
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive


The redirection just has a Cache-Control: max-age=0, which allows the 
cache to store the response, and just requires that it be revalidated 
(which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).

So, I'm still not seeing anything being cached against the server's 
request.  Try tailing the access log and grep for  200  and HIT** 
(note the spaces on either end of the 200).  That should show any 
objects (as opposed to redirects or errors) that are served from cache.

Chris

* The other URL (http://site_address.com/help/whskin_tbars.htm) is the 
referrer.
** tail -f /cache/logs/access.log | egrep '10.10.10.10.* 200 .*HIT'






Re: [squid-users] How to not cache a site?

2008-06-10 Thread Chris Robertson

Jerome Yanga wrote:

Resending as I had received a failure notice message.

I do not think that the refresh_pattern is even setup as they are all
commented out.

# grep refresh_pattern /etc/squid/squid.conf
# refresh_pattern regex min percent max
#refresh_pattern -i \.js$   0   0%  1
#refresh_pattern -i \.css$  0   10% 30
#refresh_pattern .  0   20% 4320

Attached is a zipped http header log captured using Live HTTP Headers.

Regards,
Jerome
  


Sample squid log entry from the zip file (without cookies) for reference:


TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - 
[06/Jun/2008:21:42:52 +] GET 
http://site_address.com/help/chr_ind_on.gif HTTP/1.1 302 830 
http://site_address.com/help/whskin_tbars.htm; Mozilla/5.0 (Windows; 
U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14



There were no associated HTTP headers for this object 
(http://site_address.com/help/chr_ind_on.gif)*, but here is another 
request that also resulted in a 302 (Moved Temporarily):



GET /help/chr_back.gif HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) 
Gecko/20080404 Firefox/2.0.0.14
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://site_address.com/help/whskin_tbars.htm
Cookie: [removed]

HTTP/1.x 302 Moved Temporarily
Date: Thu, 05 Jun 2008 23:40:54 GMT
Location: http://site_address.com/gateway/index.cfm?fa=loginreturnURL=http%3A%2F%2Fsite_address%2Ecom%2Fhelp%2FFchr%5Fback%2Egif
Cache-Control: max-age=0
Expires: Thu, 05 Jun 2008 23:40:54 GMT
Content-Length: 422
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive


The redirection just has a Cache-Control: max-age=0, which allows the 
cache to store the response, and just requires that it be revalidated 
(which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).
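
To make the distinction concrete, here is a minimal Python sketch of how a
shared cache might read the two Cache-Control values seen in this thread.
It is a simplification for illustration only, not Squid's actual logic:
the function names are made up, and it ignores s-maxage, Expires, and
heuristic freshness.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_policy(cache_control):
    """Classify a response for a shared cache (simplified, hypothetical helper)."""
    directives = parse_cache_control(cache_control)
    # no-store (and private, for a shared cache) forbid keeping a copy at all.
    if "no-store" in directives or "private" in directives:
        return "do not store"
    # no-cache or max-age=0: a copy may be kept, but it is stale immediately
    # and has to be checked with the origin before it is served again.
    if "no-cache" in directives or directives.get("max-age") == "0":
        return "store, revalidate before reuse"
    return "store"


# The 302 redirect from the capture:
print(shared_cache_policy("max-age=0"))
# The 200 response for jssamples_start.htm:
print(shared_cache_policy("no-store, no-cache, must-revalidate, max-age=0"))
```

The first call prints "store, revalidate before reuse" and the second
prints "do not store", which matches the behaviour described above: the
redirect can sit in the cache, while the page itself should not.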


So, I'm still not seeing anything being cached against the server's 
request.  Try tailing the access log and grep for  200  and HIT** 
(note the spaces on either end of the 200).  That should show any 
objects (as opposed to redirects or errors) that are served from cache.


Chris

* The other URL (http://site_address.com/help/whskin_tbars.htm) is the 
referrer.

** tail -f /cache/logs/access.log | egrep '10.10.10.10.* 200 .*HIT'
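
If a script is handier than the one-liner, the same check can be sketched
in Python. This is only an illustration: the function name is made up, and
it tests each token separately rather than in a fixed order, since the
sample log line above puts the result code first. The lines in the list
are adapted from that sample (the 200 entries are hypothetical).

```python
def served_from_cache(line, client_ip="10.10.10.10"):
    """True if a log line shows the client, a 200 status, and a cache HIT."""
    fields = line.split()
    has_client = client_ip in fields
    # Caveat: a bare "200" field could also be a response size of 200 bytes.
    has_200 = "200" in fields
    has_hit = any(f.startswith("TCP_") and "HIT" in f for f in fields)
    return has_client and has_200 and has_hit


sample_lines = [
    # The redirect from the thread: a HIT, but status 302, so not reported.
    "TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - "
    "[06/Jun/2008:21:42:52 +] GET "
    "http://site_address.com/help/chr_ind_on.gif HTTP/1.1 302 830",
    # Hypothetical: an object served from cache with status 200.
    "TCP_HIT:NONE 10.11.12.13 10.10.10.10 - - "
    "[06/Jun/2008:21:43:01 +] GET "
    "http://site_address.com/help/image.gif HTTP/1.1 200 1024",
    # Hypothetical: status 200 but fetched from the origin (a MISS).
    "TCP_MISS:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - "
    "[06/Jun/2008:21:43:05 +] GET "
    "http://site_address.com/help/index.html HTTP/1.1 200 2048",
]

for line in sample_lines:
    if served_from_cache(line):
        print(line)
```

Only the second line is printed. To run it against a real log, replace
sample_lines with an open file handle on access.log.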





RE: [squid-users] How to not cache a site?

2008-06-09 Thread Jerome Yanga
Henrik,

Yes.  They are objects that I found were cached by Squid.  However, they
were not supposed to be cached.

Regards,
Jerome

-Original Message-
From: Henrik Nordstrom [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 06, 2008 10:49 PM
To: Jerome Yanga
Cc: squid-users@squid-cache.org
Subject: RE: [squid-users] How to not cache a site?

On fre, 2008-06-06 at 15:48 -0700, Jerome Yanga wrote:

 I believe some do but others don't.  I just responded to Chris with the
 http headers.  The captured log is a mere mouse over of an icon in the
 site.

Yes, but are those headers from an object which you found was cached by
Squid?

Regards
Henrik




Re: [squid-users] How to not cache a site?

2008-06-06 Thread Chris Robertson

Jerome Yanga wrote:

Thanks for the quick response, Chris.

Here are my attempts to answer your questions.  :)


Using Live HTTP Headers plugin for Firefox.  It seems to show the 
Cache-Control and Pragma settings.

http://site_address.com/help/jssamples_start.htm

GET /help/jssamples_start.htm HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) 
Gecko/20080404 Firefox/2.0.0.14
Accept: 
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CFID=1234567890; CFTOKEN=1234567890; SESSIONID=1234567890; 
__utma=.1.1.1.1.3; __utmc=1; 
__utmz=1.1.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); 
__utmb=1.4.10. 1

HTTP/1.x 200 OK
Date: Thu, 05 Jun 2008 23:41:00 GMT
Server: Apache
Last-Modified: Thu, 05 Jun 2008 09:03:27 GMT
Etag: 1-1-1
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Cache-Control: no-store, no-cache, must-revalidate, max-age=0
Expires: Thu, 05 Jun 2008 23:41:00 GMT
  


These two lines (Cache-Control: no-store, and an Expires with the same 
time as the request) should stop any (compliant) shared cache from 
caching the content.  Have you modified the refresh_pattern in your 
squid.conf?



Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Pragma: no-cache
Content-Length: 811
Connection: keep-alive


I purge the cache using a purge command.

#file /cache/usr/bin/purge
/cache/usr/bin/purge: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), 
for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

...and the syntax I use is below.

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e 
site_address\.com  /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of 
site_address.com being deleted.  Hence, it is being cached.
  


Have you checked the headers returned with requests for those objects 
that are being cached?



I have also reviewed the access.log and I found some TCP_MEM_HIT:NONE, 
TCP_REFRESH_HIT, TCP_IMS_HIT, TCP_HIT, TCP_REFRESH_MISS.
  


Same story here, have you verified the headers on these objects?  
Especially the objects that result in TCP_REFRESH_HIT and TCP_IMS_HIT as 
(I think) those are requests that are being validated with the origin 
server.
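
One way to see how often each of these result codes occurs is to tally
them. The Python sketch below is only an illustration with a made-up
function name; it assumes the result code appears somewhere on each line
as a token beginning with TCP_, as in the log excerpts in this thread.

```python
import re
from collections import Counter

# Matches TCP_HIT, TCP_REFRESH_HIT, TCP_MEM_HIT, ... and stops at ':' or '/'.
RESULT_CODE = re.compile(r"\bTCP_[A-Z_]+")


def tally_result_codes(lines):
    """Count how many log lines carry each Squid result code."""
    counts = Counter()
    for line in lines:
        match = RESULT_CODE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Hypothetical, shortened log lines for illustration.
example = [
    "TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - GET /a 302",
    "TCP_MEM_HIT:NONE 10.11.12.13 10.10.10.10 - - GET /b 200",
    "TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - GET /c 302",
]

for code, count in tally_result_codes(example).most_common():
    print(code, count)
```

The URLs behind the most frequent HIT codes are then the ones whose
response headers are worth checking first.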



I cannot review the store.log as it is disabled.

I shall try the syntax you have provided on the next available downtime.

acl cacheDenyAclName dstdomain .site_address.com 
acl otherCacheDenyAclName urlpath_regex ^/help/ 
cache deny cacheDenyAclName otherCacheDenyAclName 


Thanks again, Chris.

Regards,
Jerome
  


Chris


RE: [squid-users] How to not cache a site?

2008-06-06 Thread Henrik Nordstrom
On tor, 2008-06-05 at 17:22 -0700, Jerome Yanga wrote:
 #/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 
 1 -e site_address\.com  /var/log/site_address.com_purge.log
 
 I grep'ed the log created from the command above and I can find instances of 
 site_address.com being deleted.  Hence, it is being cached.

And you are positively sure those objects do have the mentioned
Cache-Control headers?

Quite often there are different cache requirements for different kinds of
objects.

Regards
Henrik



RE: [squid-users] How to not cache a site?

2008-06-06 Thread Jerome Yanga
Henrik,

I believe some do but others don't.  I just responded to Chris with the
http headers.  The captured log is a mere mouse over of an icon in the
site.

I apologize for my noobness.

Regards,
Jerome

-Original Message-
From: Henrik Nordstrom [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 06, 2008 2:41 PM
To: Jerome Yanga
Cc: squid-users@squid-cache.org
Subject: RE: [squid-users] How to not cache a site?

On tor, 2008-06-05 at 17:22 -0700, Jerome Yanga wrote:
 #/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e site_address\.com  /var/log/site_address.com_purge.log
 
 I grep'ed the log created from the command above and I can find instances of
 site_address.com being deleted.  Hence, it is being cached.

And you are positively sure those objects do have the mentioned
Cache-Control headers?

Quite often there are different cache requirements for different kinds of
objects.

Regards
Henrik




RE: [squid-users] How to not cache a site?

2008-06-06 Thread Henrik Nordstrom
On fre, 2008-06-06 at 15:48 -0700, Jerome Yanga wrote:

 I believe some do but others don't.  I just responded to Chris with the
 http headers.  The captured log is a mere mouse over of an icon in the
 site.

Yes, but are those headers from an object which you found was cached by
Squid?

Regards
Henrik



Re: [squid-users] How to not cache a site?

2008-06-05 Thread Chris Robertson

Jerome Yanga wrote:

I have the following in my site_address.conf file.

  ExpiresDefault A0
  Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
  Header set Pragma no-cache
  


If these headers are indeed being set, Squid will not cache this content 
(without some effort).  Have you verified that these headers are being 
sent out?  If your site is internet accessible, you can use a hosted 
version of the cacheability engine (such as 
http://www.ircache.net/cgi-bin/cacheability.py), or you can download and 
set up a locally hosted version 
(http://www.mnot.net/cacheability/download.html), or you can look into 
the Live HTTP Headers plugin for Firefox 
(http://livehttpheaders.mozdev.org/).
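
If none of those tools is at hand, a few lines of Python can do the same
check from the command line. This is a minimal sketch using only the
standard library; the function names and the choice of headers to report
are assumptions for illustration.

```python
from urllib.request import Request, urlopen

# Response headers that influence caching decisions.
CACHE_HEADERS = ("Cache-Control", "Pragma", "Expires", "Last-Modified", "ETag")


def pick_cache_headers(headers):
    """Return the caching-related headers present in a header mapping."""
    return {
        name: headers[name]
        for name in CACHE_HEADERS
        if headers.get(name) is not None
    }


def fetch_cache_headers(url, timeout=10):
    """Send a HEAD request and return the caching-related response headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return pick_cache_headers(response.headers)
```

Calling fetch_cache_headers() with the URL of a page or image on the site
returns a dict of whatever caching headers came back. Note that urlopen
follows redirects, so for a URL that answers with a 302 the headers shown
belong to the final response, not to the redirect itself.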


However, this does not seem to work as whenever I perform a purge of the cache, I still see stuff being deleted.  


How (and why) are you purging the cache?  Are you sure the objects you 
are purging are ones that you have specified not be cached?  Have you 
checked your access.log to see if the requested objects are being served 
from cache, or the store.log to see if the objects are being cached at all?



I have been searching the web and I found the no_cache directive.


The no_cache directive was deprecated with the release of Squid 2.6.  It 
has been renamed cache in currently supported versions of Squid.



  I also found out that this directive is added into the squid.conf.  I cannot 
seem to find proper syntax definition for this directive.  I can only find 
examples which may not work for me.  Hence, I am posting a message for the 
first time.  Yes.  I am a noob.  Please be nice to me.  ☺

Nevertheless, given the following information, how do I use this directive?

Site:  www.site_address.com
  


Assuming you want to go this route (which will only affect your cache, 
and not ISP caches, or browser caches) to deny caching of the whole site 
you'd use something like...


acl cacheDenyAclName dstdomain .site_address.com
cache deny cacheDenyAclName


Note the leading dot on the domain name.  That will match the domain and 
all subdomains.




If I wanted to not cache just one folder in this site, how would the syntax 
look?

Site  Folder:  www.site_address.com/help/
  


acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Here, I am using a combination of ACLs to reduce the load of using 
regular expressions.  If the host matches, then (and only then) the path 
is checked.  If both match, caching is denied. The following 
(non-exhaustive) list of URLs would be excluded from caching with this 
set up...


www.site_address.com/help/index.html
site_address.com/help/image.gif
webmail.site_address.com/help/me/figure/this/out.php
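
The matching described above can be mimicked in a few lines of Python.
This is a rough model for illustration, not Squid's implementation: the
function names are made up, and the rules are hard-coded to the two ACLs
in the example (a dstdomain with a leading dot, and a urlpath_regex
anchored at ^/help/).

```python
import re
from urllib.parse import urlsplit

HELP_PATH = re.compile(r"^/help/")


def dstdomain_matches(host, pattern):
    """Model a dstdomain ACL: a leading dot matches the domain and subdomains."""
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


def cache_denied(url):
    """True if both ACLs match, i.e. 'cache deny' would apply to this URL."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    # The cheap host comparison comes first; the regex runs only if it matches.
    if not dstdomain_matches(host, ".site_address.com"):
        return False
    return HELP_PATH.search(parts.path) is not None


for url in (
    "www.site_address.com/help/index.html",
    "site_address.com/help/image.gif",
    "webmail.site_address.com/help/me/figure/this/out.php",
    "www.site_address.com/index.html",
):
    print(url, cache_denied(url))
```

The first three URLs report True and the last reports False, because its
path does not start with /help/.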


Please provide a syntax for each question.

By the way, if I am going about “no cache” the wrong way, please also indicate. 
 ☺
  


Indicated.  Make sure your server is sending out the headers you think 
it is.



Thank you in advance.

Regards,
Jyanga
  


Chris


RE: [squid-users] How to not cache a site?

2008-06-05 Thread Jerome Yanga
Thanks for the quick response, Chris.

Here are my attempts to answer your questions.  :)


Using Live HTTP Headers plugin for Firefox.  It seems to show the 
Cache-Control and Pragma settings.

http://site_address.com/help/jssamples_start.htm

GET /help/jssamples_start.htm HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) 
Gecko/20080404 Firefox/2.0.0.14
Accept: 
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CFID=1234567890; CFTOKEN=1234567890; SESSIONID=1234567890; 
__utma=.1.1.1.1.3; __utmc=1; 
__utmz=1.1.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); 
__utmb=1.4.10. 1

HTTP/1.x 200 OK
Date: Thu, 05 Jun 2008 23:41:00 GMT
Server: Apache
Last-Modified: Thu, 05 Jun 2008 09:03:27 GMT
Etag: 1-1-1
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Cache-Control: no-store, no-cache, must-revalidate, max-age=0
Expires: Thu, 05 Jun 2008 23:41:00 GMT
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Pragma: no-cache
Content-Length: 811
Connection: keep-alive


I purge the cache using a purge command.

#file /cache/usr/bin/purge
/cache/usr/bin/purge: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), 
for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

...and the syntax I use is below.

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 
-e site_address\.com  /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of 
site_address.com being deleted.  Hence, it is being cached.

I have also reviewed the access.log and I found some TCP_MEM_HIT:NONE, 
TCP_REFRESH_HIT, TCP_IMS_HIT, TCP_HIT, TCP_REFRESH_MISS.

I cannot review the store.log as it is disabled.

I shall try the syntax you have provided on the next available downtime.

acl cacheDenyAclName dstdomain .site_address.com 
acl otherCacheDenyAclName urlpath_regex ^/help/ 
cache deny cacheDenyAclName otherCacheDenyAclName 

Thanks again, Chris.

Regards,
Jerome


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 05, 2008 1:45 PM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] How to not cache a site?

Jerome Yanga wrote:
 I have the following in my site_address.conf file.

   ExpiresDefault A0
   Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
   Header set Pragma no-cache
   

If these headers are indeed being set, Squid will not cache this content 
(without some effort).  Have you verified that these headers are being 
sent out?  If your site is internet accessible, you can use a hosted 
version of the cacheability engine (such as 
http://www.ircache.net/cgi-bin/cacheability.py), or you can download and 
set up a locally hosted version 
(http://www.mnot.net/cacheability/download.html), or you can look into 
the Live HTTP Headers plugin for Firefox 
(http://livehttpheaders.mozdev.org/).

 However, this does not seem to work as whenever I perform a purge of the 
 cache, I still see stuff being deleted.  

How (and why) are you purging the cache?  Are you sure the objects you 
are purging are ones that you have specified not be cached?  Have you 
checked your access.log to see if the requested objects are being served 
from cache, or the store.log to see if the objects are being cached at all?

 I have been searching the web and I found the no_cache directive.

The no_cache directive was deprecated with the release of Squid 2.6.  It 
has been renamed cache in currently supported versions of Squid.

   I also found out that this directive is added into the squid.conf.  I 
 cannot seem to find proper syntax definition for this directive.  I can only 
 find examples which may not work for me.  Hence, I am posting a message for 
 the first time.  Yes.  I am a noob.  Please be nice to me.  ☺

 Nevertheless, given the following information, how do I use this directive?

 Site:  www.site_address.com
   

Assuming you want to go this route (which will only affect your cache, 
and not ISP caches, or browser caches) to deny caching of the whole site 
you'd use something like...

acl cacheDenyAclName dstdomain .site_address.com
cache deny cacheDenyAclName


Note the leading dot on the domain name.  That will match the domain and 
all subdomains.


 If I wanted to not cache just one folder in this site, how would the 
 syntax look?

 Site  Folder:  www.site_address.com/help/
   

acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Here, I am using a combination of ACLs to reduce the load of using 
regular expressions