Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 04/01/2012 03:21 AM, Amos Jeffries wrote:
> Other useful things to know;
>
> Generating an ETag label for each unique output helps caches detect unique versions without timestamp calculations. The easy ways to do this are to make ETag an MD5 hash of the body object. Or a hash of the Last-Modified timestamp string if the body is too expensive to compute an MD5 for. Or some other property of the resource which is guaranteed to change any time the body changes and not otherwise.

As I told you, this was my next step. I have implemented ETag and it is working perfectly fine now: I compute the MD5 of the content before formatting it, i.e. the MD5 of the actual data that I use to build the response (which can be JSON, XML, HTML, ...).

But I have an issue with Squid not sending me the ETag in the If-None-Match request header. I created another topic here on the list to discuss this: "Squid 3.1 + Accel conf + ETag = ignoring ETag".

> Cache-Control:stale-while-revalidate tells caches to revalidate, but not to block the client response waiting for that validation to finish. Clients will get the old object until a new one or 304 is received back.

I can't use this yet because I have an older version of Squid, but thank you. I'll use it as soon as my production Linux distribution updates Squid.

bye
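A minimal sketch of the ETag approach discussed above, assuming the origin can hand the raw data bytes to a helper before formatting them. The function names `make_etag` and `is_not_modified` are hypothetical and not taken from the poster's server; the comparison is the simple weak comparison a server might apply to an If-None-Match header.

```python
import hashlib
from typing import Optional


def make_etag(data: bytes) -> str:
    """Return a quoted ETag built from the MD5 hex digest of the data."""
    return '"' + hashlib.md5(data).hexdigest() + '"'


def is_not_modified(if_none_match: Optional[str], current_etag: str) -> bool:
    """Return True when the request's If-None-Match matches the current ETag.

    Handles a comma-separated list, the '*' wildcard and the weak 'W/' prefix.
    """
    if if_none_match is None:
        return False
    candidates = [c.strip() for c in if_none_match.split(",")]
    if "*" in candidates:
        return True
    # Weak comparison: ignore a leading W/ on either side.
    strip_weak = lambda tag: tag[2:] if tag.startswith("W/") else tag
    return strip_weak(current_etag) in {strip_weak(c) for c in candidates}


if __name__ == "__main__":
    etag = make_etag(b'{"products": []}')
    print(etag)
    print(is_not_modified(etag, etag))          # True -> answer 304
    print(is_not_modified('"something-else"', etag))  # False -> answer 200
```

When `is_not_modified` returns True the origin can answer 304 without rendering the body, which is where the saving comes from.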
Fwd: Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
(re-send, sent off-list by mistake)

On 04/01/2012 03:21 AM, Amos Jeffries wrote:
> Revalidation is more of a threshold which gets set on each object. Under the threshold no validation takes place, above it every request gets validated. BUT ... a 304 response revalidating the object can change the threshold by sending new timestamp and caching headers.

Thank you, I have now managed to do exactly what I need... I still have 2 little issues but I'll open another thread for those :) You've been very helpful.

> You have the two options of max-age or Expires. The thing to remember is to increment the value / threshold forward to the next point where you want revalidation to take place.
>
> With max-age: an N value which you generate dynamically by: calculate current age of object when responding, add 60.
> With Expires: you simply emit a timestamp of now() + 60 seconds on each response.

Yes, I experimented... I think 60 seconds is perfect for max-age, and I got rid of the Expires time since it's overridden by max-age anyway. I also set up the Vary and Last-Modified headers, and added Age (always 0) and Date (always now) to my server response.

Squid3 is now caching my RESTful service (GET) perfectly.

> Other useful things to know;
>
> Generating an ETag label for each unique output helps caches detect unique versions without timestamp calculations. The easy ways to do this are to make ETag an MD5 hash of the body object. Or a hash of the Last-Modified timestamp string if the body is too expensive to compute an MD5 for. Or some other property of the resource which is guaranteed to change any time the body changes and not otherwise.

Yeah, that would be the next step, but it's a little complicated for me to extract something that makes sense as an ETag. When I'm able to, I will.

> Cache-Control:stale-while-revalidate tells caches to revalidate, but not to block the client response waiting for that validation to finish. Clients will get the old object until a new one or 304 is received back.

That's really interesting, but I didn't find anything about it here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
Is it standard? Thanks.

Do you, by any chance, know how to tell the cache to return a stale value if the server is not responsive, while waiting for it to come back online? This would be wonderful because it would allow me to take down the server for maintenance without a service interruption.

>> 2) which is the best way to debug why squid3 is deciding to keep a cache entry, contact the server or not? looking at the huge debug log is not very simple; maybe some log option to filter it down to the cache decision information only would help
>
> debug_options 22,3
> ... or maybe 22,5 if there is not enough at level 3.

Perfect!!! Where can I find a list of section IDs and their meaning?
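The header set described in this message (a fixed 60-second max-age plus Vary, Last-Modified, Age and Date) could be produced roughly as follows. This is only a sketch: the function name `cache_headers` and its parameters are hypothetical, and the Vary value is the one visible in the curl output posted elsewhere in this thread. It uses `email.utils.formatdate` so that the dates come out in the HTTP date format, since a malformed Last-Modified was reported in this thread as the cause of the constant TCP_MISS.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional


def cache_headers(last_modified_epoch: float,
                  now: Optional[float] = None,
                  max_age: int = 60) -> Dict[str, str]:
    """Build caching headers for a response that is generated 'now'."""
    now = time.time() if now is None else now
    return {
        # usegmt=True yields e.g. 'Thu, 29 Mar 2012 22:14:20 GMT'
        "Date": formatdate(now, usegmt=True),
        "Last-Modified": formatdate(last_modified_epoch, usegmt=True),
        "Age": "0",
        "Cache-Control": "public, max-age=%d" % max_age,
        "Vary": "Accept, Accept-Language",
    }


if __name__ == "__main__":
    for name, value in cache_headers(last_modified_epoch=time.time() - 300).items():
        print("%s: %s" % (name, value))
```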
Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 04/02/2012 02:04 AM, Amos Jeffries wrote:
>> yes I experimented.. I think 60 seconds is perfect for max-age and I get rid of Expires time, it's overridden by the max-age anyway.
>
> For Squid-3.1+ yes that is true, older HTTP/1.0 software only obeys Expires:. So it is a matter of whether you want to further leverage any old software caches around the 'Net your users might be behind.

Good to know! I don't need support for old HTTP/1.0 but I'll keep it in mind, thanks.

>> that's really interesting but I didn't find anything about it here: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
>> is it standard?
>
> Yes. http://tools.ietf.org/html/rfc5861
>
> NP: Squid-3 is not obeying it properly yet, but other caches around the 'Net do. So it's incrementally useful already and when we roll it into Squid the gain will be immediate wherever it's used.

I wonder why the W3C doesn't list it. Thanks! I'll integrate it as soon as possible.

When you say Squid3 does not obey it properly, what exactly do you mean?

> Cache-Control:stale-if-error=N, also documented in RFC 5861.
> Squid-3.2 obeys this one already. Sorry, no 3.1 support.

Our squid3 production server is a 3.1, but I'll implement it so that it starts working when we upgrade! Thanks again, you've been of great help.

> http://wiki.squid-cache.org/KnowledgeBase/DebugSections

Perfect!

ciao,
Daniele
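For reference, the two RFC 5861 extensions mentioned above are ordinary Cache-Control directives, so the origin only has to append them to the header value. A small sketch, where the helper name `cache_control` and the example numbers are illustrative assumptions rather than values recommended in the thread:

```python
from typing import Optional


def cache_control(max_age: int,
                  stale_while_revalidate: Optional[int] = None,
                  stale_if_error: Optional[int] = None) -> str:
    """Compose a Cache-Control value, optionally with RFC 5861 directives."""
    parts = ["public", "max-age=%d" % max_age]
    if stale_while_revalidate is not None:
        # Serve the stale copy for up to N seconds while revalidating.
        parts.append("stale-while-revalidate=%d" % stale_while_revalidate)
    if stale_if_error is not None:
        # Serve the stale copy for up to N seconds if the origin errors out.
        parts.append("stale-if-error=%d" % stale_if_error)
    return ", ".join(parts)


if __name__ == "__main__":
    print("Cache-Control: " + cache_control(60, 30, 3600))
```

Caches that do not understand the extra directives are expected to ignore them, which is why they can be sent before the proxy supports them.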
Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 30/03/2012 12:47 p.m., Daniele Segato wrote:
> Hi,
>
> This is what I want to obtain:
>
> Environment:
> * everything on the same machine (Debian GNU/Linux)
> * server running on tomcat, port 8080
> * squid running on port 280
> * client can be anywhere, but for now it's on the localhost machine too
>
> I want to set up an http cache to my tomcat server to reduce the load on it.
>
> And I expect to obtain a result like this:
>
> First request
> 1. 9:00 AM (today) client request GET to http://localhost:280/myservice
> 2. squid receive the request, nothing in cache, contact my server
> 3. tomcat reply with a 200, the body and some header:
>    Cache-Control: public, max-age=3600
>    Last-Modified: //8:00 AM//
> 4. squid store in cache that result that should be valid until 10:00 AM (today) = 9:00 AM (time of the request) + 3600 seconds (max-age)
> 5. client receive the response
>
> Second request:
> 1. 9:05 AM (today) client request GET to http://localhost:280/myservice with header If-Modified-Since: //8:00 AM//
> 2. squid receive the request, see 9:05 AM < 10:00 AM --> cache hit 304
> 3. client receive the response 304
>
> Third request (after 10:00 AM)
> 1. 10:05 AM (today) client request GET to http://localhost:280/myservice with header If-Modified-Since: //8:00 AM//
> 2. squid receive the request, see 10:05 AM > 10:00 AM --> time to see if the server has a new version, forward the if-modified-since request to the server
> 3. suppose the resource is not changed: tomcat reply with a 304 Not Modified, again with headers:
>    Cache-Control: public, max-age=3600
>    Last-Modified: //8:00 AM//
> 4. squid store update the cache value to be valid until 11:05 AM (today) = 10:05 AM (time of the request) + 3600 seconds (max-age)
> 5. client receive the response: 304 Not Modified
>
> Instead squid is ALWAYS requiring the resource to the server:
>
> $ curl -v -H 'If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT' 'http://localhost:280/alfresco/service/catalog/products'
> * About to connect() to localhost port 280 (#0)
> * Trying 127.0.0.1...
> * connected
> * Connected to localhost (127.0.0.1) port 280 (#0)
> > GET /alfresco/service/catalog/products HTTP/1.1
> > User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
> > Host: localhost:280
> > Accept: */*
> > If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT
> * additional stuff not fine transfer.c:1037: 0 0
> * HTTP 1.0, assume close after body
> < HTTP/1.0 304 Not Modified
> < Date: Thu, 29 Mar 2012 23:27:57 GMT
> < Cache-Control: public, max-age=3600
> < Last-Modified: Thu, 29 Mar 2012 22:14:20 GMT

max-age
  The max-age response directive indicates that the response is to be considered stale after its age is greater than the specified number of seconds.

The logic goes like this:

  Object modified ... 22:14:20
  Valid +3600 == fresh until 23:14:20
  Current time: 23:27:57

  23:14:20 < 23:27:57 == currently stale. must revalidate.

The Expires header can be used to set an absolute time for invalidation. max-age is relative to age.

Amos
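The arithmetic in this reply can be checked with a few lines. The sketch below follows the message as written and uses the modification time as the reference point; the function names are hypothetical and this is not Squid's actual refresh algorithm. With the times from the curl output, 22:14:20 plus 3600 seconds gives 23:14:20, which is earlier than the request time 23:27:57.

```python
from datetime import datetime, timedelta


def fresh_until(reference: datetime, max_age: int) -> datetime:
    """Moment at which an object stops being fresh."""
    return reference + timedelta(seconds=max_age)


def is_stale(reference: datetime, max_age: int, now: datetime) -> bool:
    """True when 'now' is past the freshness threshold."""
    return now > fresh_until(reference, max_age)


if __name__ == "__main__":
    modified = datetime(2012, 3, 29, 22, 14, 20)
    now = datetime(2012, 3, 29, 23, 27, 57)
    print(fresh_until(modified, 3600))    # 2012-03-29 23:14:20
    print(is_stale(modified, 3600, now))  # True -> must revalidate
```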
Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 03/31/2012 10:13 AM, Amos Jeffries wrote:
> On 30/03/2012 12:47 p.m., Daniele Segato wrote:
>> Instead squid is ALWAYS requiring the resource to the server:
>>
>> $ curl -v -H 'If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT' 'http://localhost:280/alfresco/service/catalog/products'
>> * About to connect() to localhost port 280 (#0)
>> * Trying 127.0.0.1...
>> * connected
>> * Connected to localhost (127.0.0.1) port 280 (#0)
>> > GET /alfresco/service/catalog/products HTTP/1.1
>> > User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
>> > Host: localhost:280
>> > Accept: */*
>> > If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT
>> * additional stuff not fine transfer.c:1037: 0 0
>> * HTTP 1.0, assume close after body
>> < HTTP/1.0 304 Not Modified
>> < Date: Thu, 29 Mar 2012 23:27:57 GMT
>> < Cache-Control: public, max-age=3600
>> < Last-Modified: Thu, 29 Mar 2012 22:14:20 GMT
>
> max-age
>   The max-age response directive indicates that the response is to be considered stale after its age is greater than the specified number of seconds.
>
> The logic goes like this:
>
>   Object modified ... 22:14:20
>   Valid +3600 == fresh until 23:14:20
>   Current time: 23:27:57
>
>   23:14:20 < 23:27:57 == currently stale. must revalidate.
>
> The Expires header can be used to set an absolute time for invalidation. max-age is relative to age.

Hi Amos,

My content was last modified at 22:14:20. But I made two successive requests, one at 23:27:00 and one at 23:27:20. The first one (23:27:00) was a cache miss; the second is what you see above.

You are saying that max-age is added to the last modified date, but that doesn't make much sense to me. If the server (parent cache) is returning the content at 23:27:00 saying max-age 3600, I would expect that 3600 to start from now.

Anyway, I thought about this before and I also tried to modify the content and then immediately send two requests to squid.

This time, suppose:

  Object modified ... 00:00:00
  Valid +3600 == fresh until 01:00:00
  Current time: 00:05:00

  01:00:00 > 00:05:00 == currently fresh. shouldn't bother the server.

Instead, what's actually happening is that squid is sending a request to my server. It only asks for headers, but it's still doing it.

To compute the Last-Modified date, my server has to do all the work of collecting the data, looping over each data element to extract its last modified date, then computing the latest one. It builds a model that is then rendered: the rendering is pretty short anyway since it's gzipped text.

So the big work of my server is collecting the data, and my server has to do that both for a GET and for a HEAD request.

I would like squid to revalidate with my server every, say, 1 minute (even 10 seconds is OK), but it shouldn't revalidate on every single request it receives.

I hope I made my point.

I wanted to give you an example, but now squid is always giving me a TCP_MISS:

# squid3 -k debug; curl -v 'http://localhost:280/alfresco/service/catalog/products'; squid3 -k debug
* About to connect() to localhost port 280 (#0)
* Trying 127.0.0.1...
* connected
* Connected to localhost (127.0.0.1) port 280 (#0)
> GET /alfresco/service/catalog/products HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
> Host: localhost:280
> Accept: */*
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Sat, 31 Mar 2012 14:53:51 GMT
< Content-Language: en_US
< Cache-Control: public, max-age=3600
< Last-Modified: Sat, 31 Mar 2012 14:03:55 +
< Vary: Accept, Accept-Language
< Content-Type: application/json;charset=UTF-8
< Content-Length: 1668
< Server: Jetty(6.1.21)
< X-Cache: MISS from localhost
< X-Cache-Lookup: MISS from localhost:280
< Via: 1.0 localhost (squid/3.1.19)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive

In the debug log I see:

2012/03/31 16:53:51.696| getDefaultParent: returning localhost
2012/03/31 16:53:51.696| peerAddFwdServer: adding localhost DEFAULT_PARENT
2012/03/31 16:53:51.696| peerSelectCallback: http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696| fwdStartComplete: http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696| fwdConnectStart: http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:51.696| PconnPool::key(flexformAccel,8080,localhost,[::]is {flexformAccel:8080/localhost}
2012/03/31 16:53:51.696| PconnPool::pop: found myfAccel:8080/localhost(to use)
[...]
2012/03/31 16:53:52.159| mem_hdr::write: [249,251) object end 249
2012/03/31 16:53:52.159| storeSwapOut: http://localhost/alfresco/service/catalog/products
2012/03/31 16:53:52.159| storeSwapOut: store_status = STORE_PENDING
2012/03/31 16:53:52.159| store_swapout.cc(190) swapOut: storeSwapOut: mem-inmem_lo = 0
2012/03/31 16:53:52.159| store_swapout.cc(191) swapOut: storeSwapOut: mem-endOffset() = 251
2012/03/31 16:53:52.159|
Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 03/31/2012 05:01 PM, Daniele Segato wrote:
> On 03/31/2012 10:13 AM, Amos Jeffries wrote:
>> max-age
>>   The max-age response directive indicates that the response is to be considered stale after its age is greater than the specified number of seconds.
>>
>> The logic goes like this:
>>
>>   Object modified ... 22:14:20
>>   Valid +3600 == fresh until 23:14:20
>>   Current time: 23:27:57
>>
>>   23:14:20 < 23:27:57 == currently stale. must revalidate.
>>
>> The Expires header can be used to set an absolute time for invalidation. max-age is relative to age.

OK, I think I understand you now...

> you are saying that max-age is added to last modified date but that doesn't make much sense to me. If the server (parent cache) is returning the content at 23:27:00 saying max-age 3600 I would expect that 3600 start from now.
>
> anyway, I thought about this before and I also tried to modify the content, then immediately giving two request to squid.

Apparently this was caused by a mistake I made on the server (see below).

> this time, suppose:
>
>   Object modified ... 00:00:00
>   Valid +3600 == fresh until 01:00:00
>   Current time: 00:05:00
>
>   01:00:00 > 00:05:00 == currently fresh. shouldn't bother the server.
>
> instead what's actually happening is that squid is doing a request to my server, only header, but it's still doing it.
>
> My server, to compute the Last-Modified date has to do all the job of collecting the data, looping to each data element and extract, for each, the last modified date, then compute the last one.. it build a model that is then rendered: it's pretty short anyway since it's gzipped text.
>
> So the big work of my server is to collect the data, and my server have to do it both if you do a GET both if you do an HEAD request.
>
> I would like squid to revalidate with my server every, say 1 minute, even 10 seconds is ok.. but it shouldn't revalidate every single request it is receiving.
>
> I hope I made my point.

This question still stands :)

> I wanted to give you an example but now squid is always giving me a TCP_MISS

This was my mistake: the Last-Modified date format sent by the server was wrong :) Please ignore the debug log and everything after this point in my previous email... Now it's giving cache hits in RAM!

I think I can summarize my question in these two questions:

1) Can I make squid3 refresh its cache from my server every, say, 1 minute (at most), but otherwise use its cache without bothering the server (not even for headers)? How?

Not calling the server for 1 hour is, I think, a bit too much: the content can change in the meantime and I don't want the user to wait 1 hour for it. On the other hand, once that hour has passed, I don't want squid to contact my server on every single request to check whether the last modified date has changed.

2) What is the best way to debug why squid3 decides to keep a cache entry, or to contact the server or not? Looking at the huge debug log is not very simple; maybe some log option to filter it down to the cache decision information only would help.

Thanks, and sorry for the previous message.
Re: [squid-users] Squid Reverse Proxy (accel) always contacting the server
On 1/04/2012 3:53 a.m., Daniele Segato wrote:
> On 03/31/2012 05:01 PM, Daniele Segato wrote:
>> On 03/31/2012 10:13 AM, Amos Jeffries wrote:
>>> max-age
>>>   The max-age response directive indicates that the response is to be considered stale after its age is greater than the specified number of seconds.
>>>
>>> The logic goes like this:
>>>
>>>   Object modified ... 22:14:20
>>>   Valid +3600 == fresh until 23:14:20
>>>   Current time: 23:27:57
>>>
>>>   23:14:20 < 23:27:57 == currently stale. must revalidate.
>>>
>>> The Expires header can be used to set an absolute time for invalidation. max-age is relative to age.
>
> Ok I think I now understood you...
>
>> you are saying that max-age is added to last modified date but that doesn't make much sense to me. If the server (parent cache) is returning the content at 23:27:00 saying max-age 3600 I would expect that 3600 start from now.
>>
>> anyway, I thought about this before and I also tried to modify the content, then immediately giving two request to squid.
>
> apparently this was caused by a mistake I did with the server (see below)
>
>> this time, suppose:
>>
>>   Object modified ... 00:00:00
>>   Valid +3600 == fresh until 01:00:00
>>   Current time: 00:05:00
>>
>>   01:00:00 > 00:05:00 == currently fresh. shouldn't bother the server.
>>
>> instead what's actually happening is that squid is doing a request to my server, only header, but it's still doing it.
>>
>> My server, to compute the Last-Modified date has to do all the job of collecting the data, looping to each data element and extract, for each, the last modified date, then compute the last one.. it build a model that is then rendered: it's pretty short anyway since it's gzipped text.
>>
>> So the big work of my server is to collect the data, and my server have to do it both if you do a GET both if you do an HEAD request.
>>
>> I would like squid to revalidate with my server every, say 1 minute, even 10 seconds is ok.. but it shouldn't revalidate every single request it is receiving.
>>
>> I hope I made my point.
>
> this question is still in place :)

Revalidation is more of a threshold which gets set on each object. Under the threshold no validation takes place, above it every request gets validated. BUT ... a 304 response revalidating the object can change the threshold by sending new timestamp and caching headers.

>> I wanted to give you an example but now squid is always giving me a TCP_MISS
>
> this was my mistake, the Last-Modified date format was wrong from server :) please ignore the debug and everything behind this point in my previous email... Now it's giving cache hits in ram!
>
> I think I can summarize my question in this two questions:
>
> 1) can I make squid3 update the cache with my server every, say, 1 minute (at most) but use it's cache otherwise without bothering the server (not even for headers)? how?
>
> Avoiding to call the server for 1 hour, I think, it's a bit too much: the content can change in the meanwhile and I don't want the user to wait 1 hour for it. On the other part I don't want every single request after that hour is pass to see squid contacting my server to check if the last modified date is changed.

You have the two options of max-age or Expires. The thing to remember is to increment the value / threshold forward to the next point where you want revalidation to take place.

With max-age: an N value which you generate dynamically by: calculate current age of object when responding, add 60.
With Expires: you simply emit a timestamp of now() + 60 seconds on each response.

Other useful things to know;

Generating an ETag label for each unique output helps caches detect unique versions without timestamp calculations. The easy ways to do this are to make ETag an MD5 hash of the body object. Or a hash of the Last-Modified timestamp string if the body is too expensive to compute an MD5 for. Or some other property of the resource which is guaranteed to change any time the body changes and not otherwise.

Cache-Control:stale-while-revalidate tells caches to revalidate, but not to block the client response waiting for that validation to finish. Clients will get the old object until a new one or 304 is received back.

> 2) which is the best way to debug why squid3 is deciding to keep a cache entry, contact the server or not? looking at the huge debug log is not very simple maybe some log option to filter it with the cache decisions informations only would help

debug_options 22,3
... or maybe 22,5 if there is not enough at level 3.

Amos
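The two options described in this reply, sketched literally and under stated assumptions: "current age" is taken here to mean the time elapsed since the object was last modified, which is the model used earlier in the thread, and the helper names are hypothetical. Elsewhere in the thread the poster reports that a plain max-age of 60 worked for him, so treat this as an illustration of the "move the threshold forward by 60 seconds on every response" idea rather than a recommended implementation.

```python
import time
from email.utils import formatdate
from typing import Optional

REVALIDATE_EVERY = 60  # seconds until the next revalidation should happen


def dynamic_max_age(last_modified_epoch: float,
                    now: Optional[float] = None) -> int:
    """Option 1: current age of the object (assumed: since last change) + 60."""
    now = time.time() if now is None else now
    current_age = max(0, int(now - last_modified_epoch))
    return current_age + REVALIDATE_EVERY


def expires_value(now: Optional[float] = None) -> str:
    """Option 2: an absolute Expires timestamp of now() + 60 seconds."""
    now = time.time() if now is None else now
    return formatdate(now + REVALIDATE_EVERY, usegmt=True)


if __name__ == "__main__":
    modified = time.time() - 300  # pretend the data changed 5 minutes ago
    print("Cache-Control: public, max-age=%d" % dynamic_max_age(modified))
    print("Expires: " + expires_value())
```

Either value has to be regenerated on every 200 and 304 response so that the threshold keeps moving forward.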
[squid-users] Squid Reverse Proxy (accel) always contacting the server
Hi,

This is what I want to obtain:

Environment:
* everything on the same machine (Debian GNU/Linux)
* server running on tomcat, port 8080
* squid running on port 280
* client can be anywhere, but for now it's on the localhost machine too

I want to set up an HTTP cache in front of my tomcat server to reduce the load on it.

And I expect to obtain a result like this:

First request
1. 9:00 AM (today) client sends a GET request to http://localhost:280/myservice
2. squid receives the request, has nothing in cache, contacts my server
3. tomcat replies with a 200, the body and some headers:
   Cache-Control: public, max-age=3600
   Last-Modified: //8:00 AM//
4. squid stores that result in cache; it should be valid until 10:00 AM (today) = 9:00 AM (time of the request) + 3600 seconds (max-age)
5. client receives the response

Second request:
1. 9:05 AM (today) client sends a GET request to http://localhost:280/myservice with header If-Modified-Since: //8:00 AM//
2. squid receives the request, sees 9:05 AM < 10:00 AM --> cache hit 304
3. client receives the 304 response

Third request (after 10:00 AM)
1. 10:05 AM (today) client sends a GET request to http://localhost:280/myservice with header If-Modified-Since: //8:00 AM//
2. squid receives the request, sees 10:05 AM > 10:00 AM --> time to see if the server has a new version, forwards the If-Modified-Since request to the server
3. suppose the resource has not changed: tomcat replies with a 304 Not Modified, again with headers:
   Cache-Control: public, max-age=3600
   Last-Modified: //8:00 AM//
4. squid updates the cached entry to be valid until 11:05 AM (today) = 10:05 AM (time of the request) + 3600 seconds (max-age)
5. client receives the response: 304 Not Modified

Instead squid is ALWAYS requesting the resource from the server:

$ curl -v -H 'If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT' 'http://localhost:280/alfresco/service/catalog/products'
* About to connect() to localhost port 280 (#0)
* Trying 127.0.0.1...
* connected
* Connected to localhost (127.0.0.1) port 280 (#0)
> GET /alfresco/service/catalog/products HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0h zlib/1.2.6 libidn/1.24 libssh2/1.2.8 librtmp/2.3
> Host: localhost:280
> Accept: */*
> If-Modified-Since: Thu, 29 Mar 2012 22:14:20 GMT
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.0, assume close after body
< HTTP/1.0 304 Not Modified
< Date: Thu, 29 Mar 2012 23:27:57 GMT
< Cache-Control: public, max-age=3600
< Last-Modified: Thu, 29 Mar 2012 22:14:20 GMT
< Content-Type: application/json;charset=UTF-8
< Content-Length: 1158
< Server: Jetty(6.1.21)
< Age: 1
< X-Cache: HIT from localhost
< X-Cache-Lookup: HIT from localhost:280
< Via: 1.0 localhost (squid/3.1.19)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
* Connection #0 to host localhost left intact
* Closing connection #0

It says X-Cache: HIT, but I actually see the request in my server's log: it is reaching the server. And since I have to do all the work except the rendering to check whether the content has changed, this creates a heavy load on the server.

Here is my configuration.

The /etc/squid3/squid.conf (I only added the include at the top):

include /etc/squid3/accel8080.conf
acl manager proto cache_object
acl localhost src 127.0.0.1/32 ::1
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access deny all
http_port 3128
coredump_dir /var/spool/squid3
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320

The included config file (my custom reverse proxy conf):

http_port localhost:280 accel ignore-cc
cache_peer 127.0.0.1 parent 8080 0 no-query originserver no-digest default name=myAccel
refresh_all_ims off
acl Safe_ports port 280
http_access deny !Safe_ports
acl our_sites dstdomain 127.0.0.1
acl our_sites dstdomain localhost
http_access allow our_sites
cache_peer_access myAccel allow our_sites
cache_peer_access myAccel deny all

The startup log (cache.log):

2012/03/30 01:32:22| Starting Squid Cache version 3.1.19 for x86_64-pc-linux-gnu...
2012/03/30 01:32:22| Process ID 23466
2012/03/30 01:32:22| With 65535 file descriptors available
2012/03/30 01:32:22| Initializing IP Cache...
2012/03/30 01:32:22| DNS Socket created at [::], FD 7
2012/03/30 01:32:22| DNS