Based on this conversation, I added a 1h TTL to thumbnail images in vcl_backend_response, and that has gotten my hit ratio up to about 55-60%, depending on how you calculate it (hit/miss values vs. frontend/backend connections). There are now up to about 72k objects in memory, up from a max of about 60k before, though before the upgrades it was more like 600-700k objects.
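
For reference, a minimal sketch of that kind of rule. It assumes thumbnails are served under /images/thumb (the path used in the varnishtop queries quoted further down this thread) and VCL 4.0 syntax; the backend definition is only a placeholder so the snippet loads on its own, and the real VCL may well differ:

```vcl
vcl 4.0;

# Placeholder backend (hypothetical host/port), only here so the file is loadable.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Assumption: thumbnail URLs start with /images/thumb.
    # Override the TTL for those objects only; everything else keeps
    # whatever TTL the backend headers or default_ttl would give it.
    if (bereq.url ~ "^/images/thumb") {
        set beresp.ttl = 1h;
    }
}
```

Raising the value later (e.g. toward the 48h mentioned below) is just a matter of changing the duration on the `set beresp.ttl` line.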
It's been an hour now and I'm seeing a spike in expired objects and a drop in the number of objects, so I'll probably increase the TTL until I find a sweet spot. I don't think there's any risk since thumbnails don't change often, so even a max of 48h may be reasonable. I'll do more testing today and see how things go.

Thanks!

-----Original Message-----
From: Florian Tham [mailto:[email protected]]
Sent: Tuesday, December 13, 2016 12:13 PM
To: Justin Lloyd <[email protected]>
Cc: [email protected]
Subject: RE: Hit ratio dropped significantly after recent upgrades

The log shows that the fetched object is introduced into the cache with both TTL and grace time set to 120s each:

--  VCL_call       BACKEND_RESPONSE
--  TTL            VCL 120 120 0 1481637557
--  VCL_return     deliver
--  Storage        malloc s0

It would be interesting to see if a subsequent request to the same URL within less than 4 minutes would yield another miss or not.

Regards,
Florian

On 13 December 2016 at 15:27:16, Justin Lloyd <[email protected]> wrote:

> Here’s a typical varnishlog miss for a thumbnail image, appropriately
> sanitized. I can provide more if it helps.
>
> https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372
>
>
> From: Guillaume Quintard [mailto:[email protected]]
> Sent: Tuesday, December 13, 2016 12:17 AM
> To: Justin Lloyd <[email protected]>
> Cc: Dridi Boukelmoune <[email protected]>; [email protected]
> Subject: Re: Hit ratio dropped significantly after recent upgrades
>
> Can you pastebin the req+bereq transactions in varnishlog, related to
> such a miss?
>
> --
> Guillaume Quintard
>
> On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd <[email protected]> wrote:
> To follow up on my last email from Friday, at this point the problem
> boils down to one thing that I've not been able to determine: Why are
> far fewer things being cached now than before the upgrade?
>
> 1. Cookies don't seem to be the problem.
> Most appear to be Google Analytics (as opposed to session), which
> are being unset by vcl_recv.
>
> 2. varnishlog/varnishtop shows many thumbnail URLs being missed and
> virtually none are requested with a no-cache cache-control header. Is
> it possible to use these tools to determine if they (or any URLs for
> that matter) are being cached following a miss-deliver sequence? There
> are about 1.5m thumbnail files totaling around 30 GB, which prior to
> the upgrades wasn't an issue, and I don't think it is now since there
> are only a few expires and purges per minute and no nukes at all.
> Varnish is only using about 2 GB out of the 8 GB allocated to it,
> where it used to use all 8 GB and have lots of nukes and far fewer
> expires, so it's not a memory constraint.
>
> Could there be some other resource limitation I'm hitting without
> knowing it (nothing in any logs I've seen)? Everything else I could
> think of so far seems fine, e.g. open files, threads, tcp connections.
>
>
> -----Original Message-----
> From: [email protected] [mailto:varnish-misc-bounces+justinl[email protected]] On Behalf Of Justin Lloyd
> Sent: Friday, December 9, 2016 11:19 AM
> To: Dridi Boukelmoune <[email protected]>
> Cc: [email protected]
> Subject: RE: Hit ratio dropped significantly after recent upgrades
>
> I really am looking at what's happening as well. I have been looking
> at both varnishlog and varnishtop and I see a lot of thumbnail image
> requests being sent to the backend when there is still plenty of room
> for them in the cache, so even though there are a lot of thumbnail
> images, I shouldn't see so many backend requests for them. As I
> previously mentioned, I give Varnish 8 GB and it used to stay full
> (based on RSS usage and looking at nukes vs. expires) but now it
> hovers around only about 2 GB used.
> A related statistic is that there used to be 600-700k objects in
> Varnish (based on our graphs of MAIN.n_object via Collectd's
> varnish-default-struct.objects-object metric) but now there are only
> roughly 40-70k objects in Varnish at any given time. So it's
> definitely caching a lot fewer things than it was before the upgrade,
> and most of the requested URLs for requests that have cookies are for
> a lot of images and thumbnails. Images shouldn't be cached due to size
> and overall volume but thumbnails should, which is why I strip cookies
> from the thumbnails. These varnishtop commands break out /images and
> /images/thumb client requests, showing IMHO too many regular images
> being cached and nowhere near enough thumbnails:
>
> # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ "/images/thumb"'
>
>    349.47 VCL_call HASH
>    349.47 VCL_call RECV
>    349.47 VCL_call DELIVER
>    207.22 VCL_call HIT
>    116.40 VCL_call MISS
>    116.30 VCL_call PASS
>
> # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"'
>
>   1859.60 VCL_call HASH
>   1859.60 VCL_call RECV
>   1859.60 VCL_call DELIVER
>   1424.83 VCL_call MISS
>    422.84 VCL_call HIT
>    218.82 VCL_call PASS
>
> I'm still poking around trying to correlate caching of other types of
> URLs based on whether or not the requests have cookies, if
> Cache-Control gets returned, etc. but I just wanted to reply with this
> info. I do appreciate the responses I'm getting!
> :)
>
>
> -----Original Message-----
> From: Dridi Boukelmoune [mailto:[email protected]]
> Sent: Friday, December 9, 2016 10:11 AM
> To: Justin Lloyd <[email protected]>
> Cc: Dag Haavi Finstad <[email protected]>; [email protected]
> Subject: Re: Hit ratio dropped significantly after recent upgrades
>
>> To reiterate on a point in another of my responses in this thread, I
>> think it may be something about MediaWiki thumbnail images not being
>> cached properly despite our current VCL in that regard not having
>> changed from how it worked prior to the upgrade during which time we
>> were seeing a very high (86%-ish) hit ratio from the same formula.
>
> To reiterate on a point I made on a couple occasions, it's time to
> give varnishlog a spin. Too much focus on VCL, and not enough on what's
> happening.
>
> Dridi
> _______________________________________________
> varnish-misc mailing list
> [email protected]
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
