[Bug 41130] Invalidation of Varnish thumbnail cache sometimes doesn't work

bugzilla-daemon Tue, 29 Jan 2013 19:25:59 -0800

https://bugzilla.wikimedia.org/show_bug.cgi?id=41130


--- Comment #87 from Bawolff (Brian Wolff) <bawolff...@gmail.com> ---
>Current situation - Just testing now. Purging an image seems to result in the
>cache being cleared both in europe and North America. This suggests that
>problem 2 is indeed fixed (yay Leslie and anyone else involved!), which leaves
>us just with problem 1. If that is the case, the work arounds should generally
>work.

After testing a bunch more, it seems the workingness is a bit intermittent.

In one test, I purged
https://commons.wikimedia.org/wiki/File:Moscow_metro_map_ru_sb_future.svg?action=purge
and then looked at the Age header. When accessing via caches in North America,
the Age header was reset as expected (yay!).

However, when accessing via the Europe caching servers [1] there was a rather
unexpected result. Sometimes a varnish server responded (specifically, the
response had the header X-Cache: cp1033 hit (1), cp3010 miss (0), cp3009
frontend hit (2) ). When this happened, the Age header had been recently reset
as expected. This goes beyond my knowledge of WMF's network setup, but I'm
guessing that cache requests sometimes get forwarded from esams to eqiad(?),
since cp1033 is the cache server that seems to respond from eqiad too (then
again, I could be totally confused here).
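To compare responses like the one above, it can help to split the X-Cache
value into one entry per cache server. The following is only a minimal sketch:
the entry format is inferred from the single header quoted in this comment,
parse_x_cache is a hypothetical helper name, and the "backend" label for
entries without the word "frontend" is my own assumption, not WMF terminology.

```python
import re

# Entry format inferred from the one header quoted above (an assumption):
#   "<server> [frontend] <hit|miss> (<count>)"
_ENTRY = re.compile(r"^\s*(\S+)\s+(?:(frontend)\s+)?(hit|miss)\s+\((\d+)\)\s*$")


def parse_x_cache(value):
    """Split an X-Cache value into (server, layer, result, count) tuples."""
    entries = []
    for part in value.split(","):
        match = _ENTRY.match(part)
        if match is None:
            raise ValueError(f"unrecognised X-Cache entry: {part!r}")
        server, layer, result, count = match.groups()
        # "backend" is a label chosen here for entries not marked "frontend".
        entries.append((server, layer or "backend", result, int(count)))
    return entries


header = "cp1033 hit (1), cp3010 miss (0), cp3009 frontend hit (2)"
for entry in parse_x_cache(header):
    print(entry)
```

Listing the entries this way makes it easier to see which server actually
served the object when comparing several responses side by side.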

When I got a response from a squid server in esams, the Age header was not
reset. (It was 57479 seconds, about 16 hours, so nothing was horrendously old;
htcp purges were therefore getting there recently, but they aren't at the
precise moment of me writing this.)
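The reasoning above (an Age larger than the time since the purge means the
cached copy predates the purge) can be sketched as a small check. This is an
illustration only: age_hours and age_looks_reset are hypothetical helper names,
and the slack value is an arbitrary allowance for the delay between issuing the
purge and fetching the file.

```python
def age_hours(age_seconds):
    """Convert an Age header value (in seconds) to hours."""
    return age_seconds / 3600


def age_looks_reset(age_seconds, seconds_since_purge, slack=5):
    """Return True if the cached copy appears to have been stored after the purge.

    An Age larger than the time elapsed since the purge means the object
    was already in the cache before the purge was issued.
    """
    return age_seconds <= seconds_since_purge + slack


# Values from the esams squid response described above, assuming the
# request was made about a minute after the purge (an assumption).
print(round(age_hours(57479), 1))
print(age_looks_reset(57479, seconds_since_purge=60))
```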

At the same time, tests I did of purging articles resulted in the cache being
cleared both for the squids at esams and for the varnish in eqiad, so it seems
like htcp purges are being delivered properly.

The conclusion I draw from this is:
*I really have no idea :s Wild guesses include: Only the upload squid servers
are for some reason not getting the htcp multicast purges, and only sometimes?
The squid servers are overloaded? (However, the timing seems too coincidental
for that, and I would expect varnish to get overloaded first, as it has the
extra overhead of converting htcp -> http purge requests.)


It would be nice if ops (or other powers that be) could comment on what they
think the status of multicast htcp purges working is. In various places there
have been comments of "we think this is fixed now", but no one has explicitly
said any of the following:
*"The issue is 100% fixed and we're not worrying about it any longer"
*"We managed to get things sort of working, but there's still some issues, and
we're looking into them"
*"Things are horribly horribly broken, and we're doing the best we can to sort
things out"
*The issue has some other status.

It would be really nice to have a comment of that nature on this issue.


-----

[1] To simulate accessing from Europe, I used commands of the form:

 wget -U bawolff -S --header 'host: upload.wikimedia.org' \
   --no-check-certificate \
   'https://upload-lb.esams.wikimedia.org/wikipedia/commons/3/3d/Moscow_metro_map_ru_sb_future.svg'

