Re: [squid-users] Objects Release from Cache Earlier Than Expected
After some further investigation, it seems that RELEASE does not mean that Squid deletes the object from cache. It appears that it releases the object from cache to the request.

To restate the problem I am having: after about 1 to 2 days, Squid seems to re-fetch the entire object even though the object never changed on the server.

Here's my test scenario: Object is initially cached. Max age in squid.conf is set to 1 min. Before 1 min passes, I request the object and Squid returns TCP_HIT. After 1 min, I request the object again. Squid returns TCP_REFRESH_HIT, which is what I expect. I leave the entire system untouched. A day or a day and a half later, I ask for the object again and Squid returns TCP_REFRESH_MISS/200.

What could possibly cause Squid to refetch the entire object again? Could there possibly be a problem with the interaction between IE7 and Squid that is forcing Squid to re-fetch the entire object? Anyone with ideas on why this behavior occurs?

Thanks

- Original Message
From: BUI18 [EMAIL PROTECTED]
To: Henrik Nordstrom [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 8:25:14 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

The web server is IIS 6.

1) Would there be any reason why it would return the full object when in fact the object has not been modified?
2) If the min age guarantees freshness of the object, why would Squid actually issue an IMS request to the web server in the first place? As I understand it, Squid should only issue an IMS request when objects become STALE. As such, I would have expected Squid to return TCP_HIT instead of TCP_REFRESH_MISS.
3) My big concern is that the store.log shows that the object was released (deleted) from cache well before the min age while there is still an abundant amount of disk space available.

Also, one other question: When Squid issues an IMS request, which date does it use? Is it the date/time that it retrieved the object or is it the Last-Modified date/time of the object ascertained by Squid on first retrieval of the object?

Regards
-bui

- Original Message
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 3:50:20 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On Mon, 2008-10-20 at 17:45 -0700, BUI18 wrote:
> I'm not sure what you mean by a newer copy of the same URL? Can you elaborate on that a bit?

The cache (i.e. Squid) performed a conditional request to the origin web server, and the web server returned a new 200 OK object with full content instead of a small 304 Not Modified.

Regards
Henrik
Re: [squid-users] Objects Release from Cache Earlier Than Expected
Henrik - Thanks for taking time out to respond to my questions. I'm completely stumped on this one.

In our production environment, we set min and max to 5 and 7 days, respectively. As I understand it, if the request is made for the object in, say, 3 or 4 days (less than 5 days), I would always expect a TCP_HIT. But again, after 1 to 2 days, I see TCP_REFRESH_MISS and I get the whole object. I thought that setting the min to 5 days would guarantee freshness up to 5 days. Do you know of a problem that might cause Squid to ignore the rules for determining whether an object is fresh?

We used Fiddler and actually removed the If-Modified-Since part of the request and still we get TCP_REFRESH_MISS. Do you have any other ideas on areas we might want to check to see what could possibly be causing this behavior?

Thanks

- Original Message
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Wednesday, October 22, 2008 4:06:33 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On Wed, 2008-10-22 at 14:35 -0700, BUI18 wrote:
> Object is initially cached. Max age in squid.conf is set to 1 min. Before 1 min passes, I request the object and Squid returns TCP_HIT. After 1 min, I try to request for object again. Squid returns TCP_REFRESH_HIT, which is what I expect. I leave the entire system untouched. A day or a day and a half later, I ask for the object again and Squid returns TCP_REFRESH_MISS/200.

TCP_HIT is a local hit on the Squid cache. Origin server was not asked.

TCP_REFRESH_HIT is a cache hit after the origin server was asked if the object is still fresh.

TCP_REFRESH_MISS is when the origin server says the object is no longer fresh and returns a new copy on the conditional query sent by the cache (same query as in TCP_REFRESH_HIT, different response from the web server).

> What could possibly cause Squid to refetch the entire object again?

A better question is why your server responds with the entire object on an If-Modified-Since type query if it hasn't been modified. It should have responded with a 304 response as it did in the TCP_REFRESH_HIT case.

Regards
Henrik
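Henrik's description of the three result codes can be sketched as a small decision function. This is only an illustration of the explanation above, not Squid's code: the function name and its parameters are hypothetical, and real Squid logs many more tags than the four shown here.

```python
from typing import Optional


def squid_result_tag(in_cache: bool, considered_fresh: bool,
                     origin_status: Optional[int] = None) -> str:
    """Simplified mapping from cache state to the access.log tag.

    Hypothetical helper for illustration; it models only the cases
    discussed in this thread.
    """
    if not in_cache:
        # Nothing cached: the object is fetched from the origin server.
        return "TCP_MISS"
    if considered_fresh:
        # Served locally; the origin server is not asked.
        return "TCP_HIT"
    if origin_status == 304:
        # Conditional request answered with 304 Not Modified.
        return "TCP_REFRESH_HIT"
    # Origin answered the conditional request with a full new copy.
    return "TCP_REFRESH_MISS"


if __name__ == "__main__":
    print(squid_result_tag(True, False, 304))
    print(squid_result_tag(True, False, 200))
```

In this reading, the difference between TCP_REFRESH_HIT and TCP_REFRESH_MISS lies entirely in what the origin server sends back, which is why Henrik points at the server's 200 response.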
Re: [squid-users] Objects Release from Cache Earlier Than Expected
But why would Squid even issue an If-Modified-Since to the origin server if the min value is set to 5 days? Would this object not be seen as fresh and just be served up by Squid as a TCP_HIT?

- Original Message
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Wednesday, October 22, 2008 7:19:51 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

I am talking about If-Modified-Since between Squid and the web server, not browser-squid.

On Wed, 2008-10-22 at 17:57 -0700, BUI18 wrote:
> Henrik - Thanks for taking time out to respond to my questions. I'm completely stumped on this one.
>
> In our production environment, we set min and max to 5 and 7 days, respectively. As I understand it, if the request is made for the object in, say, 3 or 4 days (less than 5 days), I would always expect a TCP_HIT. But again, after 1 to 2 days, I see TCP_REFRESH_MISS and I get the whole object. I thought that setting the min to 5 days would guarantee freshness up to 5 days. Do you know of a problem that might cause Squid to ignore the rules for determining whether an object is fresh?
>
> We used Fiddler and actually removed the If-Modified-Since part of the request and still we get TCP_REFRESH_MISS. Do you have any other ideas on areas we might want to check to see what could possibly be causing this behavior?
>
> Thanks
>
> - Original Message
> From: Henrik Nordstrom [EMAIL PROTECTED]
> To: BUI18 [EMAIL PROTECTED]
> Cc: squid-users@squid-cache.org
> Sent: Wednesday, October 22, 2008 4:06:33 PM
> Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected
>
> On Wed, 2008-10-22 at 14:35 -0700, BUI18 wrote:
> > Object is initially cached. Max age in squid.conf is set to 1 min. Before 1 min passes, I request the object and Squid returns TCP_HIT. After 1 min, I try to request for object again. Squid returns TCP_REFRESH_HIT, which is what I expect. I leave the entire system untouched. A day or a day and a half later, I ask for the object again and Squid returns TCP_REFRESH_MISS/200.
>
> TCP_HIT is a local hit on the Squid cache. Origin server was not asked.
>
> TCP_REFRESH_HIT is a cache hit after the origin server was asked if the object is still fresh.
>
> TCP_REFRESH_MISS is when the origin server says the object is no longer fresh and returns a new copy on the conditional query sent by the cache (same query as in TCP_REFRESH_HIT, different response from the web server).
>
> > What could possibly cause Squid to refetch the entire object again?
>
> A better question is why your server responds with the entire object on an If-Modified-Since type query if it hasn't been modified. It should have responded with a 304 response as it did in the TCP_REFRESH_HIT case.
>
> Regards
> Henrik
Re: [squid-users] Objects Release from Cache Earlier Than Expected
The web server is IIS 6.

1) Would there be any reason why it would return the full object when in fact the object has not been modified?
2) If the min age guarantees freshness of the object, why would Squid actually issue an IMS request to the web server in the first place? As I understand it, Squid should only issue an IMS request when objects become STALE. As such, I would have expected Squid to return TCP_HIT instead of TCP_REFRESH_MISS.
3) My big concern is that the store.log shows that the object was released (deleted) from cache well before the min age while there is still an abundant amount of disk space available.

Also, one other question: When Squid issues an IMS request, which date does it use? Is it the date/time that it retrieved the object or is it the Last-Modified date/time of the object ascertained by Squid on first retrieval of the object?

Regards
-bui

- Original Message
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 3:50:20 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On Mon, 2008-10-20 at 17:45 -0700, BUI18 wrote:
> I'm not sure what you mean by a newer copy of the same URL? Can you elaborate on that a bit?

The cache (i.e. Squid) performed a conditional request to the origin web server, and the web server returned a new 200 OK object with full content instead of a small 304 Not Modified.

Regards
Henrik
[squid-users] Objects Release from Cache Earlier Than Expected
Hi - I have been trying to track down an issue with Squid 2.6 STABLE18 and why users were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on files that were recently cached. We first noticed that users were getting misses when we expected them to receive hits.

I have set the min and max age to be 5 and 7 days respectively. When I look in the store.log file, I do see objects which were known to have been cached today (based on the time/date stamp in the file name), yet they have a status code of RELEASE. The cache_dir (1 TB) on this system is only 25% full. The low watermark is set at 95% with high at 97%.

Does anyone have any ideas on why Squid would appear to purge the object earlier than expected? Thanks in advance.
Re: [squid-users] Objects Release from Cache Earlier Than Expected
Hi - Here's some additional info I noticed from the store.log.

1224524455.351 SWAPOUT 00 0003A6CB 7377CBD1A7584A5D7C7FD06B5B827595 200 1224524431 1224522501 -1 video/jpeg 1337100/1337100 GET http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid
1224544851.517 RELEASE 00 0003A6CB CD5B96F66CC94483D586D7E67A76A94C 200 1224524431 1224522501 -1 video/jpeg 1337100/-279 GET http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid
1224544862.563 SWAPOUT 00 0003CA26 7377CBD1A7584A5D7C7FD06B5B827595 200 1224544840 1224522501 -1 video/jpeg 1337100/1337100 GET http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid

Trace breaks down as follows.
1) File was first pre-fetched by the wget program.
2) File was released by cache.
3) File was re-fetched when a user tried to access it.

The thing that stuck out is that in the RELEASE line of the log, the real-length is a negative number (-279). What does this mean exactly?

- Original Message
From: BUI18 [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Monday, October 20, 2008 4:02:52 PM
Subject: [squid-users] Objects Release from Cache Earlier Than Expected

Hi - I have been trying to track down an issue with Squid 2.6 STABLE18 and why users were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on files that were recently cached. We first noticed that users were getting misses when we expected them to receive hits.

I have set the min and max age to be 5 and 7 days respectively. When I look in the store.log file, I do see objects which were known to have been cached today (based on the time/date stamp in the file name), yet they have a status code of RELEASE. The cache_dir (1 TB) on this system is only 25% full. The low watermark is set at 95% with high at 97%.

Does anyone have any ideas on why Squid would appear to purge the object earlier than expected? Thanks in advance.
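To make store.log entries like the ones above easier to compare, the fields can be split into named values. The sketch below assumes the usual whitespace-separated layout (time, action, dir number, file number, hash, status, datehdr, lastmod, expires, type, expected-length/real-length, method, URL) and assumes lastmod and expires are separate fields, as they are in a raw log. The class and function names are hypothetical.

```python
from typing import NamedTuple


class StoreLogEntry(NamedTuple):
    """One store.log line; field names are illustrative labels."""
    time: float
    action: str
    dir_number: str
    file_number: str
    key_hash: str
    status: int
    datehdr: int
    lastmod: int
    expires: int
    mime_type: str
    expected_len: int
    real_len: int
    method: str
    url: str


def parse_store_log_line(line: str) -> StoreLogEntry:
    """Split one whitespace-separated store.log line into typed fields."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    # The size field has the form expected/real, e.g. 1337100/-279.
    expected_len, real_len = fields[10].split("/")
    return StoreLogEntry(
        time=float(fields[0]), action=fields[1],
        dir_number=fields[2], file_number=fields[3], key_hash=fields[4],
        status=int(fields[5]), datehdr=int(fields[6]),
        lastmod=int(fields[7]), expires=int(fields[8]),
        mime_type=fields[9],
        expected_len=int(expected_len), real_len=int(real_len),
        method=fields[11], url=fields[12],
    )


if __name__ == "__main__":
    url = ("http://ftp.mydomain.com/myserver/websites/data/"
           "MyVideoFile1020130441180.vid")
    swapout = parse_store_log_line(
        "1224524455.351 SWAPOUT 00 0003A6CB 7377CBD1A7584A5D7C7FD06B5B827595 "
        "200 1224524431 1224522501 -1 video/jpeg 1337100/1337100 GET " + url)
    release = parse_store_log_line(
        "1224544851.517 RELEASE 00 0003A6CB CD5B96F66CC94483D586D7E67A76A94C "
        "200 1224524431 1224522501 -1 video/jpeg 1337100/-279 GET " + url)
    # Seconds the object stayed on disk between SWAPOUT and RELEASE.
    print(release.time - swapout.time)
```

With the two lines from this message, the gap between SWAPOUT and RELEASE comes out at roughly 20396 seconds, a little under six hours.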
Re: [squid-users] Objects Release from Cache Earlier Than Expected
I'm not sure what you mean by a newer copy of the same URL? Can you elaborate on that a bit?

As far as I know, the aspx page displays a list of buttons, one for each video file. When the user clicks on a button, it references the URL. I've seen cases where the user clicks the link and gets a TCP_REFRESH_HIT, but if I come back a day later (well within my min/max settings), I get a TCP_REFRESH_MISS.

I also previously posted additional info from the store.log, which shows the object being cached and then released after a short time.

- Original Message
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Monday, October 20, 2008 4:55:41 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On Mon, 2008-10-20 at 16:02 -0700, BUI18 wrote:
> Hi - I have been trying to track down an issue with Squid 2.6 STABLE18 and why users were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on files that were recently cached. We first noticed that users were getting misses when we expected them to receive hits.

TCP_REFRESH_MISS is a cache validation which indicated the object has been updated on the origin server.

> I have set the min and max age to be 5 and 7 days respectively. When I look in the store.log file, I do see objects which were known to have been cached today (based on the time/date stamp in the file name), yet they have a status code of RELEASE.

And you are sure it wasn't simply replaced with a newer copy of the same URL?

Regards
Henrik
Re: [squid-users] Object becomes STALE: refresh_pattern min and max
Sorry it took a while to get back. Not sure how to interpret X-Cache and X-Cache-Lookup. Here's the header info from Fiddler:

Request Header
GET /server1/websites/data/folder/myvideofile.vid HTTP/1.1
Client
Accept: */*
Transport
Host: ftp.mydomain.com
Proxy-Connection: Keep-Alive

Response Header
HTTP/1.0 200 OK
Content-Length: 1775372
Content-Type: video/jpeg
Last-Modified: Tue, 02 Sep 2008 23:57:25 GMT
Accept-Ranges: none
ETag: 8020b2a557dc91:3ecc
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Thu, 16 Oct 2008 22:52:49 GMT
X-Cache: MISS from squid.mydomain.com
X-Cache-Lookup: HIT from squid.mydomain.com:3128
Via: 1.0 squid.mydomain.com:3128 (squid/2.6.STABLE14)
Proxy-Connection: keep-alive

- Original Message
From: Amos Jeffries [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org; Itzcak Pechtalt [EMAIL PROTECTED]
Sent: Wednesday, September 24, 2008 6:17:22 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

BUI18 wrote:
> Hi - Thanks for responding. URL for video file never changes.

What release of Squid? Did you check the Expires header properly from the transfer rather than from the (apparently untrustworthy) info in the store log?

> I did some more checking in the Squid logs and this is what I noticed:
>
> File Properties of video file (Pacific Daylight Time (PDT))
> Created On: Monday, September 22, 2008, 8:59:35 AM
> Modified On: Monday, September 22, 2008, 8:59:35 AM
> Accessed On: Today, September 24, 2008, 3:53:12 AM
>
> *** Wget Grabs File (Time in India Standard Time (IST))
> --04:38:35-- http://ftp.mydomain.com/websites/data/myvideofile.vid
> => `/WGET/Temp/myvideofile.vid'
> 04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]
>
> The access.log confirms initial pre-fetch by wget.
> 1222124934.241 18968 192.168.200.4 TCP_MISS/200 1791684 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
> UTC = Mon, 22 Sep 2008 23:08:54 GMT
>
> The store.log shows a write from memory to disk:
> 1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
> UTC = Mon, 22 Sep 2008 23:08:54 GMT
>
> *** Then store.log shows release or removal from cache:
> 153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A 200 1222124765 1222099175 -1 video/jpeg 1791244/-279 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
> UTC = Wed, 24 Sep 2008 10:55:25 GMT
>
> Notice the -1 for expiration header (I do not set one on the object). My min age is 5 days so I'm not sure why the object would be released from cache in less than 2 days. If the object was released from cache, when the user tried to access file, Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache but when it sends an If-Modified-Since request, it thinks that the file has been modified (which it was not, as seen by the lastmod date indicated in the store.log below).
>
> *** User accessed file (access.log):
> 153742.005 17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
> UTC = Wed, 24 Sep 2008 10:55:42 GMT
>
> Then store.log shows a write to disk:
> 153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 153575 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
> UTC = Wed, 24 Sep 2008 10:55:42 GMT
> datehdr: Wed, 24 Sep 2008 10:55:55 GMT
> lastmod: Mon, 22 Sep 2008 15:59:35 GMT
>
> Anyone with ideas on why this behavior occurs?
>
> thanks
>
> - Original Message
> From: Itzcak Pechtalt [EMAIL PROTECTED]
> To: Squid Users squid-users@squid-cache.org
> Sent: Wednesday, September 24, 2008 4:35:59 AM
> Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max
>
> On Wed, Sep 24, 2008 at 1:39 PM, BUI18 [EMAIL PROTECTED] wrote:
> > Hi - I have a Squid box with tons of disk for the cache_dir (hundreds of GB). I use wget to perform some pre-fetching of large video files. I've set the min and max age to 5 days and 7 days (in minutes). And although I have plenty of disk space available, I still receive TCP_REFRESH_MISS for files that had been pre-fetched and later accessed the same day. Does anyone know why Squid would consider it as STALE? I thought that setting the min value for refresh_pattern for the video file would guarantee freshness. Not only does the cache consider it STALE, it then goes and pre-fetches a new copy even though I know
[squid-users] Object becomes STALE: refresh_pattern min and max
Hi - I have a Squid box with tons of disk for the cache_dir (hundreds of GB). I use wget to perform some pre-fetching of large video files. I've set the min and max age to 5 days and 7 days (in minutes). And although I have plenty of disk space available, I still receive TCP_REFRESH_MISS for files that had been pre-fetched and later accessed the same day. Does anyone know why Squid would consider it as STALE? I thought that setting the min value for refresh_pattern for the video file would guarantee freshness. Not only does the cache consider it STALE, it then goes and pre-fetches a new copy even though I know that the video file has not changed. Any help would be greatly appreciated. Thanks.
[squid-users] How to Cache aspx Pages?
Hi - I need to cache aspx pages. I have read through the Squid FAQ for caching dynamic content and have tried the following configuration; however, it does not seem to cache aspx pages with or without query strings.

A sample link may look like this (this is not a working link) -- http://www.domain.com/junk.aspx?id=12345

My current relevant configuration from squid.conf:

# Allows dynamic content with query strings in the path
acl junkname urlpath_regex -i \?
cache allow junkname
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
refresh_pattern -i \.aspx$ 1440 90% 2880 ignore-reload

I get a status of TCP_MISS in cache.log when I request the page. Am I missing something? Do I need an entry for http_access as well? Does anyone see a problem with the above configuration parameters? Does anyone know how to properly cache aspx pages?

Thanks in advance.
Re: [squid-users] Object becomes STALE: refresh_pattern min and max
Hi - Thanks for responding. URL for video file never changes. I did some more checking in the Squid logs and this is what I noticed:

File Properties of video file (Pacific Daylight Time (PDT))
Created On: Monday, September 22, 2008, 8:59:35 AM
Modified On: Monday, September 22, 2008, 8:59:35 AM
Accessed On: Today, September 24, 2008, 3:53:12 AM

*** Wget Grabs File (Time in India Standard Time (IST))
--04:38:35-- http://ftp.mydomain.com/websites/data/myvideofile.vid
=> `/WGET/Temp/myvideofile.vid'
04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]

The access.log confirms initial pre-fetch by wget.
1222124934.241 18968 192.168.200.4 TCP_MISS/200 1791684 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
UTC = Mon, 22 Sep 2008 23:08:54 GMT

The store.log shows a write from memory to disk:
1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Mon, 22 Sep 2008 23:08:54 GMT

*** Then store.log shows release or removal from cache:
153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A 200 1222124765 1222099175 -1 video/jpeg 1791244/-279 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Wed, 24 Sep 2008 10:55:25 GMT

Notice the -1 for expiration header (I do not set one on the object). My min age is 5 days so I'm not sure why the object would be released from cache in less than 2 days. If the object was released from cache, when the user tried to access file, Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache but when it sends an If-Modified-Since request, it thinks that the file has been modified (which it was not, as seen by the lastmod date indicated in the store.log below).

*** User accessed file (access.log):
153742.005 17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
UTC = Wed, 24 Sep 2008 10:55:42 GMT

Then store.log shows a write to disk:
153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 153575 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Wed, 24 Sep 2008 10:55:42 GMT
datehdr: Wed, 24 Sep 2008 10:55:55 GMT
lastmod: Mon, 22 Sep 2008 15:59:35 GMT

Anyone with ideas on why this behavior occurs?

thanks

- Original Message
From: Itzcak Pechtalt [EMAIL PROTECTED]
To: Squid Users squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 4:35:59 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 1:39 PM, BUI18 [EMAIL PROTECTED] wrote:
> Hi - I have a Squid box with tons of disk for the cache_dir (hundreds of GB). I use wget to perform some pre-fetching of large video files. I've set the min and max age to 5 days and 7 days (in minutes). And although I have plenty of disk space available, I still receive TCP_REFRESH_MISS for files that had been pre-fetched and later accessed the same day. Does anyone know why Squid would consider it as STALE? I thought that setting the min value for refresh_pattern for the video file would guarantee freshness. Not only does the cache consider it STALE, it then goes and pre-fetches a new copy even though I know that the video file has not changed. Any help would be greatly appreciated. Thanks.

Hi,
Check if the video URL changes from request to request. In YouTube video, even if the main URL is the same, there is a request ID in the URL which changes per request.

Itzcak
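The epoch timestamps in the access.log and store.log lines above can be checked against the "UTC =" annotations with a few lines of Python. This is a small sketch using only the standard library; the helper name is hypothetical.

```python
from email.utils import formatdate


def epoch_to_http_date(epoch_seconds: float) -> str:
    """Render a Unix timestamp as an HTTP-style GMT date string."""
    # Fractional seconds (e.g. the .241 in the log) are dropped.
    return formatdate(int(epoch_seconds), usegmt=True)


if __name__ == "__main__":
    # Time of the initial TCP_MISS / SWAPOUT in the logs above.
    print(epoch_to_http_date(1222124934.241))
    # lastmod field of the same store.log entry.
    print(epoch_to_http_date(1222099175))
```

The two values decode to Mon, 22 Sep 2008 23:08:54 GMT and Mon, 22 Sep 2008 15:59:35 GMT, which agree with the annotations in the message.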
Re: [squid-users] Object becomes STALE: refresh_pattern min and max
Hi - I went through the same thinking as you described below. I checked the Expires header from the server and we do not set one. I checked via the Fiddler web debugging tool. I also verified with the dev guys here regarding no Expires header. I have set the min and max via refresh_pattern because of the absence of the Expires header, thinking that Squid would keep it FRESH.

I recently posted the details of the Squid logs, but will re-post them here so that the thread will follow properly (it may help others when searching). Hoping someone could explain the behavior I see below.

File Properties of video file (Pacific Daylight Time (PDT))
Created On: Monday, September 22, 2008, 8:59:35 AM
Modified On: Monday, September 22, 2008, 8:59:35 AM
Accessed On: Today, September 24, 2008, 3:53:12 AM

*** Wget Grabs File (Time in India Standard Time (IST))
--04:38:35-- http://ftp.mydomain.com/websites/data/myvideofile.vid
=> `/WGET/Temp/myvideofile.vid'
04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]

The access.log confirms initial pre-fetch by wget.
1222124934.241 18968 192.168.200.4 TCP_MISS/200 1791684 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
UTC = Mon, 22 Sep 2008 23:08:54 GMT

The store.log shows a write from memory to disk:
1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Mon, 22 Sep 2008 23:08:54 GMT

*** Then store.log shows release or removal from cache:
153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A 200 1222124765 1222099175 -1 video/jpeg 1791244/-279 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Wed, 24 Sep 2008 10:55:25 GMT

Notice the -1 for expiration header (I do not set one on the object). My min age is 5 days so I'm not sure why the object would be released from cache in less than 2 days. If the object was released from cache, when the user tried to access file, Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache but when it sends an If-Modified-Since request, it thinks that the file has been modified (which it was not, as seen by the lastmod date indicated in the store.log below).

*** User accessed file (access.log):
153742.005 17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 video/jpeg
UTC = Wed, 24 Sep 2008 10:55:42 GMT

Then store.log shows a write to disk:
153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3 200 153575 1222099175 -1 video/jpeg 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
UTC = Wed, 24 Sep 2008 10:55:42 GMT
datehdr: Wed, 24 Sep 2008 10:55:55 GMT
lastmod: Mon, 22 Sep 2008 15:59:35 GMT

- Original Message
From: Michael Alger [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 4:49:38 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 03:39:16AM -0700, BUI18 wrote:
> I have a Squid box with tons of disk for the cache_dir (hundreds of GB). I use wget to perform some pre-fetching of large video files. I've set the min and max age to 5 days and 7 days (in minutes). And although I have plenty of disk space available, I still receive TCP_REFRESH_MISS for files that had been pre-fetched and later accessed the same day. Does anyone know why Squid would consider it as STALE? I thought that setting the min value for refresh_pattern for the video file would guarantee freshness. Not only does the cache consider it STALE, it then goes and pre-fetches a new copy even though I know that the video file has not changed. Any help would be greatly appreciated. Thanks.

The fact that it's doing TCP_REFRESH_xxx means squid does have a cached copy which it considers potentially stale. So it's sending an If-Modified-Since request to the origin server. The origin is then either saying yes, it's been modified since you retrieved it -- here's a new one; or it has no idea how to handle IMS and is sending the whole object regardless.

What Expires: header is the server sending? You can use the -S switch with wget to show the server response headers when you're doing your pre-fetch. If they look fine, maybe keep a copy of them and compare later in the day when squid decides it needs a new one.

I assume you're setting the min and max age via refresh_pattern lines? Remember that these are only used in the absence of an Expires header.
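One way to see how the origin server handles If-Modified-Since, independently of Squid, is to send the conditional request by hand. The sketch below uses only Python's standard library. It assumes the Last-Modified time is the date sent in the header, which is an assumption for illustration and not something this thread confirms about Squid; the function names are hypothetical and the URL is the placeholder used in the thread.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url: str, last_modified_epoch: int) -> urllib.request.Request:
    """Build a GET request carrying an If-Modified-Since header."""
    ims = formatdate(last_modified_epoch, usegmt=True)
    return urllib.request.Request(url, headers={"If-Modified-Since": ims})


def origin_status(url: str, last_modified_epoch: int, timeout: float = 10.0) -> int:
    """Return the HTTP status the server gives to the conditional request."""
    request = build_conditional_request(url, last_modified_epoch)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status   # 200 means a full copy was sent
    except urllib.error.HTTPError as err:
        return err.code              # urllib reports 304 as an HTTPError


if __name__ == "__main__":
    # lastmod value taken from the store.log lines in this thread.
    req = build_conditional_request(
        "http://ftp.mydomain.com/websites/data/myvideofile.vid", 1222099175)
    print(req.get_header("If-modified-since"))
```

A 304 from `origin_status` would suggest the server understands the conditional request; a 200 for an unchanged file would match the behaviour Michael describes, where the whole object is sent regardless.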
Re: [squid-users] Object becomes STALE: refresh_pattern min and max
My Squid version is 2.6/STABLE14.

Here's my refresh_pattern from squid.conf:

#Suggested default:
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
#The following line will ignore a client no-cache header
#refresh_pattern -i \.vid$ 0 90% 2880 ignore-reload
refresh_pattern -i \.vid$ 7200 100% 10080 ignore-reload
refresh_pattern . 0 20% 4320

A link to the file looks something like this -- http://ftp.mydomain.com/websites/data/myvideofile.vid

I have to set up a station to grab the header but I can tell you that it does not seem out of the ordinary. There is one cache-control header: Pragma: no-cache. I believe I handle this with the ignore-reload option. Our server is an IIS server running on Windows 2003.

I also ran a test with min and max age of 0 and 1 respectively, and it seems to work. I receive a TCP_REFRESH_HIT, which is what I would have expected as these files do not change.

Please let me know if you have any other ideas on how to track down why it would release from cache before min age with no Expiration set on the object. Open to any suggestions.

Thanks

- Original Message
From: Michael Alger [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 8:09:50 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 05:29:52AM -0700, BUI18 wrote:
> I went through the same thinking as you described below. I checked the Expires header from the server and we do not set one. I checked via the Fiddler web debugging tool. I also verified with the dev guys here regarding no Expires header. I have set the min and max via refresh_pattern because of the absence of the Expires header, thinking that Squid would keep it FRESH.
>
> Notice the -1 for expiration header (I do not set one on the object). My min age is 5 days so I'm not sure why the object would be released from cache in less than 2 days. If the object was released from cache, when the user tried to access file, Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache but when it sends an If-Modified-Since request, it thinks that the file has been modified (which it was not, as seen by the lastmod date indicated in the store.log below).

Interesting that it's caching the file for 2 days. What are the full headers returned with the object? Any other cache control headers?

Is there any chance you have a conflicting refresh_pattern, so the freshness rules being applied aren't the ones you're expecting? May be worth doing some tests with very small max ages to confirm it's matching the right rule.
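Michael's question about a conflicting refresh_pattern can be explored offline with a small model. The sketch below makes two assumptions: that refresh_pattern lines are tried in order and the first matching one is used, and that freshness follows the simplified heuristic summarized in the squid.conf comments (stale if age exceeds max; otherwise decided by the last-modified factor when a Last-Modified time is known; otherwise fresh while age is within min). It ignores Expires headers, client cache-control, and options such as ignore-reload, and it is not Squid's actual code. All names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class RefreshPattern(NamedTuple):
    regex: str
    ignore_case: bool
    min_minutes: int
    percent: int
    max_minutes: int


# The rules quoted in the message above, in the same order.
RULES = [
    RefreshPattern(r"^ftp:", False, 1440, 20, 10080),
    RefreshPattern(r"^gopher:", False, 1440, 0, 1440),
    RefreshPattern(r"\.vid$", True, 7200, 100, 10080),
    RefreshPattern(r".", False, 0, 20, 4320),
]


def first_match(url: str, rules: List[RefreshPattern]) -> Optional[RefreshPattern]:
    """Return the first rule whose regex matches the URL, or None."""
    for rule in rules:
        flags = re.IGNORECASE if rule.ignore_case else 0
        if re.search(rule.regex, url, flags):
            return rule
    return None


def is_fresh(age_s: float, rule: RefreshPattern,
             lm_interval_s: Optional[float] = None) -> bool:
    """Simplified freshness check (assumed heuristic, see lead-in).

    age_s: seconds since the object was fetched.
    lm_interval_s: fetch time minus Last-Modified time, if known.
    """
    if age_s > rule.max_minutes * 60:
        return False
    if lm_interval_s is not None and lm_interval_s > 0:
        # Last-modified factor: fresh while age is below percent of
        # the interval between modification and fetch.
        return age_s < lm_interval_s * rule.percent / 100
    return age_s <= rule.min_minutes * 60


if __name__ == "__main__":
    rule = first_match(
        "http://ftp.mydomain.com/websites/data/myvideofile.vid", RULES)
    print(rule)
    print(5 * 24 * 60, 7 * 24 * 60)  # 5 and 7 days expressed in minutes
```

Note that the sample URL begins with http:, so the `^ftp:` rule does not match it even though the hostname starts with ftp; under the first-match assumption the `\.vid$` rule is the one selected. Whether the last-modified factor explains the early refreshes reported in this thread is not established here; the model only shows that, under these assumptions, min alone would not decide freshness when a Last-Modified time is present.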