Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-22 Thread BUI18
After some further investigation, it seems that RELEASE does not mean that Squid 
deletes the object from cache.  It appears that it releases from cache to the 
request.

To restate the problem I am having:

Squid seems to re-fetch the entire object even though the object never changed 
on the server after about 1 day - 2 days.

Here's my test scenario:

Object is initially cached.  Max age in squid.conf is set to 1 min.  Before 1 
min passes, I request the object and Squid returns TCP_HIT.  After 1 min, I 
request the object again.  Squid returns TCP_REFRESH_HIT, which is what I 
expect.  I leave the entire system untouched.  A day or a day and a half later, 
I ask for the object again and Squid returns TCP_REFRESH_MISS/200.

What could possibly cause Squid to refetch the entire object again?

Could there possibly be a problem with the interaction between IE7 and Squid 
that is forcing Squid to re-fetch the entire object?

Anyone with ideas on why this behavior occurs?

Thanks



- Original Message 
From: BUI18 [EMAIL PROTECTED]
To: Henrik Nordstrom [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 8:25:14 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

The web server is IIS 6.

1)  Would there be any reason why it would return the full object when in fact 
the object has not been modified?
2)  If the min age guarantees freshness of the object, why would Squid actually 
issue an IMS request to the web server in the first place?  As I understand 
it, Squid should only issue an IMS request when objects become STALE.  As 
such, I would have expected Squid to return TCP_HIT instead of TCP_REFRESH_MISS.
3)  My big concern is that the store.log shows that the object was released 
(deleted) from cache well before the min age while there is still an abundant 
amount of disk space available.

Also, one other question:

When Squid issues an IMS request, which date does it use?  Is it the date/time 
that it retrieved the object or is it the Last Modified date/time of the object 
ascertained by Squid on first retrieval of the object?

Regards
-bui



- Original Message 
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 3:50:20 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On mån, 2008-10-20 at 17:45 -0700, BUI18 wrote:
 I'm not sure what you mean by a newer copy of the same URL.  Can you elaborate 
 on that a bit?

The cache (i.e. Squid) performed a conditional request to the origin web
server, and the web server returned a new 200 OK object with full
content instead of a small 304 Not Modified.

Regards
Henrik





Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-22 Thread BUI18
Henrik -  Thanks for taking time out to respond to my questions.  I'm 
completely stumped on this one.

In our production environment, we set min and max to 5 and 7 days, respectively.

As I understand it, if the request is made for the object in, say, 3 days or 4 
days (less than 5 days), I would always expect a TCP_HIT.

But again, after 1 to 2 days, I see TCP_REFRESH_MISS and I get the whole object.

I thought that setting the min to 5 days would guarantee freshness up to 5 
days.

Do you know of a problem that maybe causes Squid to ignore the rules on 
determining whether an object is fresh?

We used fiddler and actually removed the If-Modified-Since part of the 
request and still we get TCP_REFRESH_MISS.

Do you have any other ideas on areas we might want to check to see what could 
possibly be causing this behavior?

Thanks





- Original Message 
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Wednesday, October 22, 2008 4:06:33 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On ons, 2008-10-22 at 14:35 -0700, BUI18 wrote:

 Object is initially cached.  Max age in squid.conf is set to 1 min.
 Before 1 min passes, I request the object and Squid returns TCP_HIT.
 After 1 min, I try to request for object again.  Squid returns
 TCP_REFRESH_HIT, which is what I expect.  I leave the entire system
 untouched.  A day or a day and a half later, I ask for the object
 again and Squid returns TCP_REFRESH_MISS/200.


TCP_HIT is a local hit on the Squid cache. Origin server was not asked.

TCP_REFRESH_HIT is a cache hit after the origin server was asked if the
object is still fresh.

TCP_REFRESH_MISS is when the origin server says the object is no longer
fresh and returns a new copy on the conditional query sent by the cache.
(same query as in TCP_REFRESH_HIT, different response from the web
server).

 What could possibly cause Squid to refetch the entire object again?

A better question is why your server responds with the entire object on
an If-Modified-Since type query if it hasn't been modified. It should
have responded with a 304 response as it did in the TCP_REFRESH_HIT
case.

Regards
Henrik
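
One way to check this point is to send the same kind of conditional request to the origin server directly and look at the status code it returns. The following is a minimal Python sketch, not Squid's own logic: the helper names are hypothetical, the lastmod value is the one reported in the store.log excerpts elsewhere in this thread, and the mapping of 304 and 200 to TCP_REFRESH_HIT and TCP_REFRESH_MISS only restates the explanation above.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_modified_epoch):
    """Build a GET request carrying If-Modified-Since for the given time."""
    request = urllib.request.Request(url)
    # formatdate(..., usegmt=True) produces an HTTP-style date ending in "GMT".
    request.add_header("If-Modified-Since",
                       formatdate(last_modified_epoch, usegmt=True))
    return request


def describe_validation(status):
    """Map the origin's answer to the log tag Squid would be expected to show."""
    if status == 304:
        return "TCP_REFRESH_HIT"    # origin confirmed the cached copy
    if status == 200:
        return "TCP_REFRESH_MISS"   # origin sent a full new copy
    return "OTHER"


def validate(url, last_modified_epoch):
    """Send the conditional request and describe the result (needs network)."""
    request = build_conditional_request(url, last_modified_epoch)
    try:
        with urllib.request.urlopen(request) as response:
            return describe_validation(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for non-2xx answers, including 304.
        return describe_validation(err.code)
```

If a call such as validate("http://ftp.mydomain.com/websites/data/myvideofile.vid", 1222099175) reports TCP_REFRESH_MISS for a file known to be unchanged, that would suggest the origin is answering the conditional request with a full 200 response.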



  


Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-22 Thread BUI18
But why would Squid even issue an If-Modified-Since to origin server if the 
min value is set to 5 days?  Would this object not be seen as fresh and would 
just be served up by Squid as a TCP_HIT?







- Original Message 
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Wednesday, October 22, 2008 7:19:51 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

I am talking about If-Modified-Since between Squid and the web server,
not browser-squid.


On ons, 2008-10-22 at 17:57 -0700, BUI18 wrote:
 Henrik -  Thanks for taking time out to respond to my questions.  I'm 
 completely stumped on this one.
 
 In our production environment, we set min and max to 5 and 7 days, 
 respectively.
 
 As I understand it, if the request is made for the object in, say, 3 days or 
 4 days (less than 5 days), I would always expect a TCP_HIT.
 
 But again, after 1 to 2 days, I see TCP_REFRESH_MISS and I get the whole 
 object.
 
 I thought that by setting the min to 5 days would guarantee freshness up to 5 
 days.
 
 Do you know of a problem that maybe causes Squid to ignore the rules on 
 determining whether an object is fresh?
 
 We used fiddler and actually removed the If-Modified-Since part of the 
 request and still we get TCP_REFRESH_MISS.
 
 Do you have any other ideas on areas we might want to check to see what could 
 possibly be causing this behavior?
 
 Thanks
 
 
 
 
 
 - Original Message 
 From: Henrik Nordstrom [EMAIL PROTECTED]
 To: BUI18 [EMAIL PROTECTED]
 Cc: squid-users@squid-cache.org
 Sent: Wednesday, October 22, 2008 4:06:33 PM
 Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected
 
 On ons, 2008-10-22 at 14:35 -0700, BUI18 wrote:
 
  Object is initially cached.  Max age in squid.conf is set to 1 min.
  Before 1 min passes, I request the object and Squid returns TCP_HIT.
  After 1 min, I try to request for object again.  Squid returns
  TCP_REFRESH_HIT, which is what I expect.  I leave the entire system
  untouched.  A day or a day and a half later, I ask for the object
  again and Squid returns TCP_REFRESH_MISS/200.
 
 
 TCP_HIT is a local hit on the Squid cache. Origin server was not asked.
 
 TCP_REFRESH_HIT is a cache hit after the origin server was asked if the
 object is still fresh.
 
 TCP_REFRESH_MISS is when the origin server says the object is no longer
 fresh and returns a new copy on the conditional query sent by the cache.
 (same query as in TCP_REFRESH_HIT, different response from the web
 server).
 
  What could possibly cause Squid to refetch the entire object again?
 
 A better question is why your server responds with the entire object on
 an If-Modified-Since type query if it hasn't been modified. It should
 have responded with a 304 response as it did in the TCP_REFRESH_HIT
 case.
 
 Regards
 Henrik
 
 
 
  



  


Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-21 Thread BUI18
The web server is IIS 6.

1)  Would there be any reason why it would return the full object when in fact 
the object has not been modified?
2)  If the min age guarantees freshness of the object, why would Squid actually 
issue an IMS request to the web server in the first place?  As I understand 
it, Squid should only issue an IMS request when objects become STALE.  As 
such, I would have expected Squid to return TCP_HIT instead of TCP_REFRESH_MISS.
3)  My big concern is that the store.log shows that the object was released 
(deleted) from cache well before the min age while there is still an abundant 
amount of disk space available.

Also, one other question:

When Squid issues an IMS request, which date does it use?  Is it the date/time 
that it retrieved the object or is it the Last Modified date/time of the object 
ascertained by Squid on first retrieval of the object?

Regards
-bui



- Original Message 
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Tuesday, October 21, 2008 3:50:20 AM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On mån, 2008-10-20 at 17:45 -0700, BUI18 wrote:
 I'm not sure what you mean by a newer copy of the same URL.  Can you elaborate 
 on that a bit?

The cache (i.e. Squid) performed a conditional request to the origin web
server, and the web server returned a new 200 OK object with full
content instead of a small 304 Not Modified.

Regards
Henrik






[squid-users] Objects Release from Cache Earlier Than Expected

2008-10-20 Thread BUI18
Hi -

I have been trying to track down an issue with Squid 2.6 STABLE18 and why users 
were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on files that were 
recently cached.  We first noticed that users were getting misses when we 
expected them to receive hits.

I have set the min and max age to be 5 and 7 days respectively.  When I look in 
the store.log file, I do see objects which were known to have been cached today 
(based on time/date stamp in the file name), yet they have status code of 
RELEASE.  

The cache_dir (1 TB) on this system is only 25% full.  The low watermark is set 
at 95% with high at 97%.

Does anyone have any ideas on why Squid would appear to purge the object 
earlier than expected?

Thanks in advance.


__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-20 Thread BUI18
Hi -

Here's some additional info I noticed from the store.log.

1224524455.351 SWAPOUT 00 0003A6CB 7377CBD1A7584A5D7C7FD06B5B827595  200 1224524431
1224522501 -1 video/jpeg 1337100/1337100 GET 
http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid
1224544851.517 RELEASE 00 0003A6CB CD5B96F66CC94483D586D7E67A76A94C  200 1224524431
1224522501 -1 video/jpeg 1337100/-279 GET 
http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid
1224544862.563 SWAPOUT 00 0003CA26 7377CBD1A7584A5D7C7FD06B5B827595  200 1224544840
1224522501 -1 video/jpeg 1337100/1337100 GET 
http://ftp.mydomain.com/myserver/websites/data/MyVideoFile1020130441180.vid

Trace breaks down as follows.
1)  File was first pre-fetched by wget program.
2)  File was released by cache.
3)  File was re-fetched when a user tried to access it.

The thing that stuck out is that in the RELEASE line of the log, the
real-length is a negative number (-279).  What does this mean exactly?
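
To compare entries like these field by field, a store.log line can be split into named fields. The following is a minimal Python sketch under stated assumptions: the field names and their order (time, action, dir number, file number, hash, status, datehdr, lastmod, expires, type, expected-length/real-length, method, URL) are my reading of the Squid 2.6 store.log layout and should be checked against the Squid documentation, and the function name is hypothetical. It also tolerates lastmod and expires running together (for example 1224522501-1), which can happen when log lines are pasted into mail.

```python
import re
from typing import NamedTuple


class StoreLogEntry(NamedTuple):
    timestamp: float
    action: str
    dir_number: str
    file_number: str
    key_hash: str
    status: int
    datehdr: int
    lastmod: int
    expires: int
    mime_type: str
    expected_len: int
    real_len: int
    method: str
    url: str


def parse_store_log_entry(entry: str) -> StoreLogEntry:
    """Parse one store.log entry (assumed layout, see the note above)."""
    fields = entry.split()
    if len(fields) == 12:
        # lastmod and expires have run together, e.g. "1224522501-1".
        fused = re.fullmatch(r"(-?\d+)(-\d+)", fields[7])
        if fused is None:
            raise ValueError("unrecognized store.log entry: %r" % entry)
        fields[7:8] = [fused.group(1), fused.group(2)]
    if len(fields) != 13:
        raise ValueError("unrecognized store.log entry: %r" % entry)
    expected_len, real_len = fields[10].split("/")
    return StoreLogEntry(
        timestamp=float(fields[0]),
        action=fields[1],
        dir_number=fields[2],
        file_number=fields[3],
        key_hash=fields[4],
        status=int(fields[5]),
        datehdr=int(fields[6]),
        lastmod=int(fields[7]),
        expires=int(fields[8]),
        mime_type=fields[9],
        expected_len=int(expected_len),
        real_len=int(real_len),
        method=fields[11],
        url=fields[12],
    )
```

Parsing the RELEASE entry above this way makes it easy to see that lastmod is unchanged between the SWAPOUT and RELEASE lines, that expires is -1, and that only the real-length differs.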


- Original Message 
From: BUI18 [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Monday, October 20, 2008 4:02:52 PM
Subject: [squid-users] Objects Release from Cache Earlier Than Expected

Hi -

I have been trying to track down an issue with Squid 2.6 STABLE18 and why users 
were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on files that were 
recently cached.  We first noticed that users were getting misses when we 
expected them to receive hits.

I have set the min and max age to be 5 and 7 days respectively.  When I look in 
the store.log file, I do see objects which were known to have been cached today 
(based on time/date stamp in the file name), yet they have status code of 
RELEASE.  

The cache_dir (1 TB) on this system is only 25% full.  The low watermark is set 
at 95% with high at 97%.

Does anyone have any ideas on why Squid would appear to purge the object 
earlier than expected?

Thanks in advance.




Re: [squid-users] Objects Release from Cache Earlier Than Expected

2008-10-20 Thread BUI18
I'm not sure what you mean by a newer copy of the same URL.  Can you elaborate on 
that a bit?

As far as I know, the aspx page displays a list of buttons for each video 
file.  When the user clicks on the button, it references the URL.

I've seen it where the user clicks the link and gets a TCP_REFRESH_HIT, but if I 
come back a day later (well within my min/max settings), I get a 
TCP_REFRESH_MISS.

I also previously posted additional info from the store.log, which shows the 
object being cached and then released after a short time.





- Original Message 
From: Henrik Nordstrom [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org
Sent: Monday, October 20, 2008 4:55:41 PM
Subject: Re: [squid-users] Objects Release from Cache Earlier Than Expected

On mån, 2008-10-20 at 16:02 -0700, BUI18 wrote:
 Hi -
 
 I have been trying to track down an issue with Squid 2.6 STABLE18 and
 why users were getting TCP_REFRESH_MISS instead of TCP_REFRESH_HIT on
 files that were recently cached.  We first noticed that users were
 getting misses when we expected them to receive hits.

TCP_REFRESH_MISS is a cache validation which indicated the object has
been updated on the origin server.

 I have set the min and max age to be 5 and 7 days respectively.  When
 I look in the store.log file, I do see objects which were known to
 have been cached today (based on time/date stamp in the file name), yet
 they have status code of RELEASE.  

And you are sure it wasn't simply replaced with a newer copy of the same
URL?

Regards
Henrik




Re: [squid-users] Object becomes STALE: refresh_pattern min and max

2008-10-16 Thread BUI18
Sorry it took a while to get back.  Not sure how to interpret X-Cache and 
X-Cache-Lookup.

Here's the header info from Fiddler:

Request Header

GET /server1/websites/data/folder/myvideofile.vid HTTP/1.1

Client
Accept: */*
Transport
Host: ftp.mydomain.com
Proxy-Connection: Keep-Alive

Response Header

HTTP/1.0 200 OK
Content-Length: 1775372
Content-Type: video/jpeg
Last-Modified: Tue, 02 Sep 2008 23:57:25 GMT
Accept-Ranges: none
ETag: 8020b2a557dc91:3ecc
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Thu, 16 Oct 2008 22:52:49 GMT
X-Cache: MISS from squid.mydomain.com
X-Cache-Lookup: HIT from squid.mydomain.com:3128
Via: 1.0 squid.mydomain.com:3128 (squid/2.6.STABLE14)
Proxy-Connection: keep-alive





- Original Message 
From: Amos Jeffries [EMAIL PROTECTED]
To: BUI18 [EMAIL PROTECTED]
Cc: squid-users@squid-cache.org; Itzcak Pechtalt [EMAIL PROTECTED]
Sent: Wednesday, September 24, 2008 6:17:22 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

BUI18 wrote:
 Hi - Thanks for responding.  URL for video file never changes.
 

What release of Squid?

Did you check the Expires header properly from the transfer rather than 
from the (apparently untrustworthy) info in the store log?


 I did some more checking in the Squid logs and this is what I noticed:
 
 File Properties of video file (Pacific Daylight Time (PDT))
 
 Created On: Monday, September 22, 2008, 8:59:35 AM
 
 Modified On: Monday, September 22, 2008, 8:59:35 AM
 
 Accessed On: Today, September 24, 2008, 3:53:12 AM
 
 ***
 Wget Grabs File (Time in India Standard Time (IST))
 
 --04:38:35--  http://ftp.mydomain.com/websites/data/myvideofile.vid
  = `/WGET/Temp/myvideofile.vid'
 04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]
 
 The access.log confirms initial pre-fetch by wget.
 
 1222124934.241  18968 192.168.200.4 TCP_MISS/200 1791684 GET 
 http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
 video/jpeg
 
 UTC = Mon, 22 Sep 2008 23:08:54 GMT
 
 The store.log shows a write from memory to disk:
 
 1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
 1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET 
 http://ftp.mydomain.com/websites/data/myvideofile.vid
 
 UTC = Mon, 22 Sep 2008 23:08:54 GMT
 
 ***
 
 Then Store.log shows release or removal from cache:
 
 153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A  200 
 1222124765 1222099175 -1 video/jpeg 1791244/-279 GET 
 http://ftp.mydomain.com/websites/data/myvideofile.vid
 
 UTC = Wed, 24 Sep 2008 10:55:25 GMT
 
 Notice the -1 for expiration header (I do not set one on the object).  My min 
 age is 5 days so I'm not sure why the object would be released from cache in 
 less than 2 days.
 
 If the object was released from cache, when the user tried to access file, 
 Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache 
 but when it sends an If-Modified-Since request, it thinks that the file has 
 been modified (which it was not, as seen by the lastmod date indicated in the 
 store.log below).
 
 ***
 
 User accessed file (access.log):
 
 153742.005  17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET 
 http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
 video/jpeg
 
 UTC = Wed, 24 Sep 2008 10:55:42 GMT
 
 Then store.log shows a write to disk
 
 153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
 153575 1222099175 -1 video/jpeg 
 1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid
 
 UTC = Wed, 24 Sep 2008 10:55:42 GMT
 datehdr: Wed, 24 Sep 2008 10:55:55 GMT
 lastmod: Mon, 22 Sep 2008 15:59:35 GMT
 
 Anyone with ideas on why this behavior occurs?
 
 thanks
 
 
 
 
 
 - Original Message 
 From: Itzcak Pechtalt [EMAIL PROTECTED]
 To: Squid Users squid-users@squid-cache.org
 Sent: Wednesday, September 24, 2008 4:35:59 AM
 Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max
 
 On Wed, Sep 24, 2008 at 1:39 PM, BUI18 [EMAIL PROTECTED] wrote:
 Hi -

 I have squid box with tons of disk for the cache_dir
 (hundreds of GB).  I use wget to perform some pre-fetching of large
 video files.  I've set the min and max age to 5 days and 7 days (in
 minutes).  And although I have plenty of disk space available, I still
 receive TCP_REFRESH_MISS for files that had been pre-fetched and later
 accessed the same day.  Does anyone know why Squid would consider it as
 STALE?  I thought that by setting the min value for refresh_pattern for
 the video file would guarantee freshness.  Not only does the cache
 consider it STALE, it then goes and pre-fetches a new copy even though
 I know

[squid-users] Object becomes STALE: refresh_pattern min and max

2008-09-24 Thread BUI18
Hi -

I have squid box with tons of disk for the cache_dir
(hundreds of GB).  I use wget to perform some pre-fetching of large
video files.  I've set the min and max age to 5 days and 7 days (in
minutes).  And although I have plenty of disk space available, I still
receive TCP_REFRESH_MISS for files that had been pre-fetched and later
accessed the same day.  Does anyone know why Squid would consider it as
STALE?  I thought that by setting the min value for refresh_pattern for
the video file would guarantee freshness.  Not only does the cache
consider it STALE, it then goes and pre-fetches a new copy even though
I know that the video file has not changed.  Any help would be greatly
appreciated.  Thanks.


  


[squid-users] How to Cache aspx Pages?

2008-09-24 Thread BUI18
Hi -  I need to cache aspx pages.  I have read through the Squid FAQ for 
caching dynamic content and have tried the following configuration; however, 
it does not seem to cache aspx pages with or without query strings.

A sample link may look like this (this is not a working link) -- 
http://www.domain.com/junk.aspx?id=12345

My current relevant configurations from squid.conf:

# Allows dynamic content with query strings in the path
acl junkname urlpath_regex -i \?
cache allow junkname
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
refresh_pattern -i \.aspx$  1440 90% 2880 ignore-reload

I get status of TCP_MISS in cache.log when I request the page.  Am I missing 
something?  Do I need an entry for http_access as well?

Does anyone see a problem with the above configuration parameters?  Does 
anyone know how to properly cache aspx pages?

Thanks in advance.



  


Re: [squid-users] Object becomes STALE: refresh_pattern min and max

2008-09-24 Thread BUI18
Hi - Thanks for responding.  URL for video file never changes.

I did some more checking in the Squid logs and this is what I noticed:

File Properties of video file (Pacific Daylight Time (PDT))

Created On: Monday, September 22, 2008, 8:59:35 AM

Modified On: Monday, September 22, 2008, 8:59:35 AM

Accessed On: Today, September 24, 2008, 3:53:12 AM

***
Wget Grabs File (Time in India Standard Time (IST))

--04:38:35--  http://ftp.mydomain.com/websites/data/myvideofile.vid
 = `/WGET/Temp/myvideofile.vid'
04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]

The access.log confirms initial pre-fetch by wget.

1222124934.241  18968 192.168.200.4 TCP_MISS/200 1791684 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
video/jpeg

UTC = Mon, 22 Sep 2008 23:08:54 GMT

The store.log shows a write from memory to disk:

1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Mon, 22 Sep 2008 23:08:54 GMT

***

Then Store.log shows release or removal from cache:

153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A  200 
1222124765 1222099175 -1 video/jpeg 1791244/-279 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Wed, 24 Sep 2008 10:55:25 GMT

Notice the -1 for expiration header (I do not set one on the object).  My min 
age is 5 days so I'm not sure why the object would be released from cache in 
less than 2 days.

If the object was released from cache, when the user tried to access file, 
Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache 
but when it sends an If-Modified-Since request, it thinks that the file has been 
modified (which it was not, as seen by the lastmod date indicated in the 
store.log below).

***

User accessed file (access.log):

153742.005  17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
video/jpeg

UTC = Wed, 24 Sep 2008 10:55:42 GMT

Then store.log shows a write to disk

153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
153575 1222099175 -1 video/jpeg 
1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Wed, 24 Sep 2008 10:55:42 GMT
datehdr: Wed, 24 Sep 2008 10:55:55 GMT
lastmod: Mon, 22 Sep 2008 15:59:35 GMT

Anyone with ideas on why this behavior occurs?

thanks
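
The epoch values in these log lines can be cross-checked against the GMT times quoted alongside them. The following is a small Python sketch using only the values that appear intact in this message (the first SWAPOUT timestamp, its datehdr, and lastmod); the helper name is hypothetical, and the 5-day figure is the min age described above.

```python
from email.utils import formatdate


def epoch_to_gmt(epoch):
    """Render a Unix timestamp as an HTTP-style GMT date string."""
    return formatdate(epoch, usegmt=True)


swapout_time = 1222124934.241   # first SWAPOUT entry in store.log
datehdr = 1222124765            # Date header recorded with the object
lastmod = 1222099175            # Last-Modified recorded with the object
min_age_seconds = 5 * 24 * 3600 # min age of 5 days

print(epoch_to_gmt(swapout_time))               # Mon, 22 Sep 2008 23:08:54 GMT
print(epoch_to_gmt(lastmod))                    # Mon, 22 Sep 2008 15:59:35 GMT
# Earliest time the object would pass the min age, counted from datehdr:
print(epoch_to_gmt(datehdr + min_age_seconds))  # Sat, 27 Sep 2008 23:06:05 GMT
```

The lastmod value matches the file's modification time of 8:59:35 AM PDT, and a 5-day min age counted from the Date header would not be reached until 27 September, well after the 24 September refetch.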





- Original Message 
From: Itzcak Pechtalt [EMAIL PROTECTED]
To: Squid Users squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 4:35:59 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 1:39 PM, BUI18 [EMAIL PROTECTED] wrote:
 Hi -

 I have squid box with tons of disk for the cache_dir
 (hundreds of GB).  I use wget to perform some pre-fetching of large
 video files.  I've set the min and max age to 5 days and 7 days (in
 minutes).  And although I have plenty of disk space available, I still
 receive TCP_REFRESH_MISS for files that had been pre-fetched and later
 accessed the same day.  Does anyone know why Squid would consider it as
 STALE?  I thought that by setting the min value for refresh_pattern for
 the video file would guarantee freshness.  Not only does the cache
 consider it STALE, it then goes and pre-fetches a new copy even though
 I know that the video file has not changed.  Any help would be greatly
 appreciated.  Thanks.





Hi,
Check if the video URL changes from request to request. In YouTube
video, even if the main URL is the same, there is a request ID in the URL that
changes per request.

Itzcak



  


Re: [squid-users] Object becomes STALE: refresh_pattern min and max

2008-09-24 Thread BUI18
Hi -

I went through the same thinking you described below.

I checked the Expires header from the server and we do not set one.  I checked 
via Fiddler web debug tool.  I also verified with the dev guys here regarding 
no Expires header.  I have set the min and max via refresh_pattern because of 
the absence of the Expires header thinking that Squid would keep it FRESH.

I recently posted the details of the Squid logs, but will re-post them here so 
that the thread will follow properly (it may help others when searching).

Hoping someone could explain the behavior I see below.


File Properties of video file (Pacific Daylight Time (PDT))

Created On: Monday, September 22, 2008, 8:59:35 AM

Modified On: Monday, September 22, 2008, 8:59:35 AM

Accessed On: Today, September 24, 2008, 3:53:12 AM

***
Wget Grabs File (Time in India Standard Time (IST))

--04:38:35--  http://ftp.mydomain.com/websites/data/myvideofile.vid
= `/WGET/Temp/myvideofile.vid'
04:38:54 (93.91 KB/s) - `/WGET/Temp/myvideofile.vid' saved [1791244/1791244]

The access.log confirms initial pre-fetch by wget.

1222124934.241  18968 192.168.200.4 TCP_MISS/200 1791684 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
video/jpeg

UTC = Mon, 22 Sep 2008 23:08:54 GMT

The store.log shows a write from memory to disk:

1222124934.241 SWAPOUT 00 00057B65 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
1222124765 1222099175 -1 video/jpeg 1791244/1791244 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Mon, 22 Sep 2008 23:08:54 GMT

***

Then Store.log shows release or removal from cache:

153725.068 RELEASE 00 00057B65 605FAC36E93B0CDE81902BBC6C5EC71A  200 
1222124765 1222099175 -1 video/jpeg 1791244/-279 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Wed, 24 Sep 2008 10:55:25 GMT

Notice the -1 for expiration header (I do not set one on the object).  My min 
age is 5 days so I'm not sure why the object would be released from cache in 
less than 2 days.

If the object was released from cache, when the user tried to access file, 
Squid reports TCP_REFRESH_MISS, which to me means that it was found in cache 
but when it sends an If-Modified-Since request, it thinks that the file has been 
modified (which it was not, as seen by the lastmod date indicated in the 
store.log below).

***

User accessed file (access.log):

153742.005  17275 192.168.200.52 TCP_REFRESH_MISS/200 1791688 GET 
http://ftp.mydomain.com/websites/data/myvideofile.vid - DIRECT/69.43.136.41 
video/jpeg

UTC = Wed, 24 Sep 2008 10:55:42 GMT

Then store.log shows a write to disk

153742.005 SWAPOUT 00 00088336 1E18E35BDC9307C6BC3FBEFD5B4120A3  200 
153575 1222099175 -1 video/jpeg 
1791244/1791244 GET http://ftp.mydomain.com/websites/data/myvideofile.vid

UTC = Wed, 24 Sep 2008 10:55:42 GMT
datehdr: Wed, 24 Sep 2008 10:55:55 GMT
lastmod: Mon, 22 Sep 2008 15:59:35 GMT






- Original Message 
From: Michael Alger [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 4:49:38 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 03:39:16AM -0700, BUI18 wrote:
 I have squid box with tons of disk for the cache_dir
 (hundreds of GB).  I use wget to perform some pre-fetching of large
 video files.  I've set the min and max age to 5 days and 7 days (in
 minutes).  And although I have plenty of disk space available, I still
 receive TCP_REFRESH_MISS for files that had been pre-fetched and later
 accessed the same day.  Does anyone know why Squid would consider it as
 STALE?  I thought that by setting the min value for refresh_pattern for
 the video file would guarantee freshness.  Not only does the cache
 consider it STALE, it then goes and pre-fetches a new copy even though
 I know that the video file has not changed.  Any help would be greatly
 appreciated.  Thanks.

The fact that it's doing TCP_REFRESH_xxx means squid does have a
cached copy which it considers potentially stale. So it's sending an
If-Modified-Since request to the origin server. The origin is then
either saying yes, it's been modified since you retrieved it --
here's a new one; or it has no idea how to handle IMS and is
sending the whole object regardless.

What Expires: header is the server sending? You can use the -S
switch with wget to show the server response headers when you're
doing your pre-fetch. If they look fine, maybe keep a copy of them
and compare later in the day when squid decides it needs a new one.

I assume you're setting the min and max age via refresh_pattern
lines? Remember that these are only used in the absence of an
Expires header.
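
The freshness decision described here can be sketched in a few lines. The following Python sketch follows the order of rules as I understand them from the refresh_pattern comments in squid.conf (Expires first, then max, then the lm-factor percent, then min); it is an illustration, not Squid's code. The function and parameter names are hypothetical, and it ignores request headers and options such as ignore-reload.

```python
def is_fresh(age_seconds, resource_age_seconds,
             min_minutes, percent, max_minutes,
             expires_in_seconds=None):
    """Return True if a cached object would count as fresh under these rules.

    age_seconds: time since the response was fetched (now - Date header).
    resource_age_seconds: Date header minus Last-Modified at fetch time.
    expires_in_seconds: Expires minus now, or None when no Expires was sent.
    """
    # An explicit Expires header takes precedence over refresh_pattern.
    if expires_in_seconds is not None:
        return expires_in_seconds > 0
    # Older than max: stale regardless of the other values.
    if age_seconds > max_minutes * 60:
        return False
    # lm-factor: response age relative to how old the file was when fetched.
    if resource_age_seconds > 0:
        lm_factor = age_seconds / resource_age_seconds
        if lm_factor < percent / 100.0:
            return True
    # Younger than min: fresh.
    if age_seconds < min_minutes * 60:
        return True
    return False


# Object fetched about 1.5 days ago; it was 25590 s old when fetched
# (datehdr 1222124765 minus lastmod 1222099175 from the store.log).
print(is_fresh(129600, 25590, min_minutes=7200, percent=100, max_minutes=10080))
print(is_fresh(129600, 25590, min_minutes=0, percent=90, max_minutes=2880))
```

Under these rules alone, a min of 7200 minutes would keep a 1.5-day-old object fresh, while a min of 0 with 90% would not, so a TCP_REFRESH_MISS within the min age suggests that some other rule or header is being applied.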



  


Re: [squid-users] Object becomes STALE: refresh_pattern min and max

2008-09-24 Thread BUI18
My Squid Version is 2.6/STABLE14

Here's my refresh_pattern from squid.conf

#Suggested default:
refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440

#The following line will ignore a client no-cache header
#refresh_pattern -i \.vid$   0   90% 2880 ignore-reload
refresh_pattern -i \.vid$   7200   100%   10080 ignore-reload

refresh_pattern .   0   20% 4320

A link to the file looks something like this -- 
http://ftp.mydomain.com/websites/data/myvideofile.vid

I have to set up a station to grab the header but I can tell you that it does 
not seem out of the ordinary.

There is one cache-control:  Pragma: no-cache

I believe I handle this with the ignore-reload options.

Our server is an IIS server running on Windows 2003.

I also ran a test with min and max age of 0 and 1 respectively, and it seems to 
work.  I receive a TCP_REFRESH_HIT, which is what I would have expected as 
these files do not change.

Please let me know if you have any other ideas on how to track down why it 
would release from cache before min age with no Expiration set on the object.

Open to any suggestions.
Thanks




- Original Message 
From: Michael Alger [EMAIL PROTECTED]
To: squid-users@squid-cache.org
Sent: Wednesday, September 24, 2008 8:09:50 AM
Subject: Re: [squid-users] Object becomes STALE: refresh_pattern min and max

On Wed, Sep 24, 2008 at 05:29:52AM -0700, BUI18 wrote:
 I went through your same thinking as you described below.
 
 I checked the Expires header from the server and we do not set
 one.  I checked via Fiddler web debug tool.  I also verified with
 the dev guys here regarding no Expires header.  I have set the min
 and max via refresh_pattern because of the absence of the Expires
 header thinking that Squid would keep it FRESH.
 
 Notice the -1 for expiration header (I do not set one on the
 object).  My min age is 5 days so I'm not sure why the object
 would be released from cache in less than 2 days.
 
 If the object was released from cache, when the user tried to
 access file, Squid reports TCP_REFRESH_MISS, which to me means
 that it was found in cache but when it sends an If-Modified-Since
 request, it thinks that the file has been modified (which it was
 not, as seen by the lastmod date indicated in the store.log below).

Interesting that it's caching the file for 2 days. What are the full
headers returned with the object? Any other cache control headers?

Is there any chance you have a conflicting refresh_pattern, so the
freshness rules being applied aren't the ones you're expecting? May
be worth doing some tests with very small max ages to confirm it's
matching the right rule.