Hi, I'm having quite a hard time trying to figure out what is going on exactly here, sorry for the length but I prefer to give as much detail as possible.
I'm using varnish on several varnish servers with the same VCL configuration. I have varnish 2.0.4 and 2.0.6, Debian GNU/Linux on amd64 hardware (etch and lenny). All servers happen to have the same weird behaviour regardless of those differences : All is going well, great performances and cache hit ratio, everyone is happy, unless some servers randomly start to make too many backend connexions. After eliminating the usual suspects (cookies, accept-encoding), I started investigating what kind of requests were there, and I realized that somehow most of the requests where hitpass for objects that were previously successfully cached on the same varnish server. But the objects impacted are not necessarily the same on every varnish server, and are not always the same, nor at the same time. After some help from the irc channel (thanks everybody), I started to track some "culprits" with "varnishtop -b -i TxURL", then searching the last "varnishlog -c -o" entry for this object, and finally searching the request id referenced in its HitPass entry. On every case I tracked like this, the scenario was the same, and if I understand things correctly (I'll give commented varnishlog just after), it is like this: - a request comes from the object, it's a miss so we ask the backend for it. It is a pretty simple GET request, WITHOUT "Range" header - the backend, I don't know why, sometimes (and sometimes only) answers with 206 Partial Content, wich I believe is totally wrong and will be tracked down by them, but let's keep this to the varnish side - so because of that 206 answer, a hit for pass cache entry is created, with a certain TTL which in my case seems to be way too long - later on, for each simple (no Range header) request for this object, we hit the hitpass entry and pass to the backend I think I have different problems here, one on the backend obviously, but the others with varnish : - seeing as wrong as this backend answer is, I think it shouldn't be dealt with "normally" by varnish, it would me more interesting to let the next request have its chance to populate the cache with a valid (200) answer, in order to keep the following requests being able to be served from the cache. Is this something that should be changed in varnish, or in the vcl (and how) ? - the TTL used for the HitPass object seems to be the default_ttl (-t 86400 in my setup); I tried to keep it low (0s or 10s) from the vcl config, like this : sub vcl_fetch { if (!obj.cacheable) { set obj.ttl = 0s; return (pass); } /* ... */ } but I doesn't seem to work. I have the feeling that the HitPass TTL is somehow forced to the default_ttl value (which I think is not what the documentation implied). What is going on exactly here and how can this be changed from the vcl ? - I also tried to purge the hit-for-pass culprit with the CLI (purge req.host == the.host && req.url == /the/url), but it didn't work (the object was still there after). Any idea how to purge such objects ? Is this a bug ? Here are the varnishlog outputs. Firt, here is a pass because of the HitPass object, with a correct (200) answer : 18 ReqStart c 89.80.252.34 1069 1643273915 18 RxRequest c GET 18 RxURL c /elements/css/v7/styles_services.css 18 RxProtocol c HTTP/1.1 18 RxHeader c Accept: */* 18 RxHeader c Referer: http://www.sfr.fr/fr/adsl.jsp?version=6004&contener=2 18 RxHeader c Accept-Language: fr 18 RxHeader c Accept-Encoding: gzip, deflate 18 RxHeader c User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) 18 RxHeader c Host: img.s-sfr.fr 18 RxHeader c Connection: Keep-Alive 18 VCL_call c recv lookup 18 VCL_call c hash hash 18 HitPass c 1641180098 18 VCL_call c pass pass 18 Backend c 30 s_sfr s_sfr 18 ObjProtocol c HTTP/1.1 18 ObjStatus c 200 18 ObjResponse c OK 18 ObjHeader c Date: Sat, 27 Feb 2010 14:27:59 GMT 18 ObjHeader c Server: Apache/2.2.13 (Fedora) 18 ObjHeader c Last-Modified: Tue, 09 Feb 2010 09:59:57 GMT 18 ObjHeader c Via: http2 18 ObjHeader c Cache-Control: public 18 ObjHeader c Expires: Sat, 27 Feb 2010 16:20:53 GMT 18 ObjHeader c Vary: Accept-Encoding 18 ObjHeader c Content-Type: text/css;charset=ISO-8859-15 18 TTL c 1643273915 RFC 6773 1267280879 1267280879 1267287653 0 0 18 VCL_call c fetch deliver 18 Length c 3284 18 VCL_call c deliver deliver 18 TxProtocol c HTTP/1.1 18 TxStatus c 200 18 TxResponse c OK 18 TxHeader c Server: Apache/2.2.13 (Fedora) 18 TxHeader c Last-Modified: Tue, 09 Feb 2010 09:59:57 GMT 18 TxHeader c Via: http2 18 TxHeader c Cache-Control: public 18 TxHeader c Expires: Sat, 27 Feb 2010 16:20:53 GMT 18 TxHeader c Vary: Accept-Encoding 18 TxHeader c Content-Type: text/css;charset=ISO-8859-15 18 TxHeader c Content-Length: 3284 18 TxHeader c Date: Sat, 27 Feb 2010 14:27:59 GMT 18 TxHeader c X-Varnish: 1643273915 18 TxHeader c Age: 0 18 TxHeader c Via: 1.1 varnish 18 TxHeader c Connection: keep-alive 18 ReqEnd c 1643273915 1267280879.342154026 1267280879.351441145 0.075042009 0.009259224 0.000027895 => so, the HitPass was created by request id 1641180098. Let's see that request : 59 ReqStart c 77.205.33.140 52401 1641180098 59 RxRequest c GET 59 RxURL c /elements/css/v7/styles_services.css 59 RxProtocol c HTTP/1.1 59 RxHeader c Accept: */* 59 RxHeader c Referer: http://www.sfr.fr/fr/adsl.jsp 59 RxHeader c Accept-Language: fr 59 RxHeader c User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; GTB6.4; SLCC1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30729) 59 RxHeader c Accept-Encoding: gzip, deflate 59 RxHeader c Host: img.s-sfr.fr 59 RxHeader c Connection: Keep-Alive 59 VCL_call c recv lookup 59 VCL_call c hash hash 59 VCL_call c miss fetch 59 Backend c 21 s_sfr s_sfr 59 ObjProtocol c HTTP/1.1 59 ObjStatus c 206 59 ObjResponse c Partial Content 59 ObjHeader c Date: Fri, 26 Feb 2010 22:01:08 GMT 59 ObjHeader c Server: Apache/2.2.13 (Fedora) 59 ObjHeader c Last-Modified: Tue, 09 Feb 2010 09:59:57 GMT 59 ObjHeader c Via: http4 59 ObjHeader c Cache-Control: public 59 ObjHeader c Expires: Fri, 26 Feb 2010 23:08:27 GMT 59 ObjHeader c Content-Type: text/css;charset=ISO-8859-15 59 TTL c 1641180098 RFC 4038 1267221668 1267221668 1267225707 0 0 59 VCL_call c fetch 59 TTL c 1641180098 VCL 0 1267221668 59 VCL_return c pass 59 Length c 2211 59 VCL_call c deliver deliver 59 TxProtocol c HTTP/1.1 59 TxStatus c 206 59 TxResponse c Partial Content 59 TxHeader c Server: Apache/2.2.13 (Fedora) 59 TxHeader c Last-Modified: Tue, 09 Feb 2010 09:59:57 GMT 59 TxHeader c Via: http4 59 TxHeader c Cache-Control: public 59 TxHeader c Expires: Fri, 26 Feb 2010 23:08:27 GMT 59 TxHeader c Content-Type: text/css;charset=ISO-8859-15 59 TxHeader c Content-Length: 2211 59 TxHeader c Date: Fri, 26 Feb 2010 22:01:08 GMT 59 TxHeader c X-Varnish: 1641180098 59 TxHeader c Age: 0 59 TxHeader c Via: 1.1 varnish 59 TxHeader c Connection: keep-alive 59 ReqEnd c 1641180098 1267221668.117854595 1267221668.127019882 2.395729065 0.009134293 0.000030994 => Here is the "wrong" 206 answer. No Range header was in the request so 200 and the full content was expected. Also no Content-Range in the backend answer, that's also very strange. Anyway, its TTL would be 4038 calculated from the HTTP headers (RFC 4038), then it is forced to 0 from VCL (VCL 0), then we go to pass. Now pay attention to the Date: headers from the backend answers (yes, the backend time is correct ;)). The HitPass object was created "Fri, 26 Feb 2010 22:01:08 GMT" with a calculated TTL of 4038 (forced to 0 from VCL but it doesn't work), so it should be purged about 4038s later, but it is still there for a request made "Sat, 27 Feb 2010 14:27:59 GMT", thats much, much more !! That should not be possible, unless default_ttl (-t 86400 in my setup) is used instead of the computed TTL... Many thanks in advance for any explanation or advice on this situation ! Best regards, Tom -- Thomas Parmelan -+- thomas.parme...@free.fr -+- t...@ankh.fr.eu.org _______________________________________________ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc