BBlack added a comment.
Re: `wikibase.org`, adding it as a non-canonical redirect to catch
confusion from those who manually type URLs is fine, but we should make sure
everyone is clear on which domainname is canonical for this project (I assume
`https://wikiba.se/`) and make sure
BBlack added a comment.
@WMDE-leszek Thanks for looking into it! I believe @CRoslof is who you want
to coordinate with on our end, whose last statement on this topic back in
January was:
In T99531#4878798 <https://phabricator.wikimedia.org/T99531#4878798>,
@CRoslof
BBlack added a comment.
As noted in T155359 <https://phabricator.wikimedia.org/T155359> - WMDE has
moved the hosting of this to some other platform, including the DNS hosting
(and we never had the whois entry). So this task can resolve as Decline I
think (or whatever), but we should
BBlack added a comment.
We'll also need to normalize the incoming `Accept` headers up in the edge
cache layer to avoid pointless vary explosions. Ideally the normalization
should exactly match the application-layer logic that chooses the output
content type. Do you have some pseudo
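The normalization idea above can be sketched roughly as follows. This is a minimal illustration only: the canonical content-type list and default here are stand-ins, not the actual application-layer set, which would need to be copied exactly for the cache and applayer to agree.

```python
# Sketch of Accept-header normalization at the cache edge: collapse the many
# client-sent variants into a small canonical set so Vary doesn't explode.
# CANONICAL and DEFAULT are illustrative assumptions, not the real list.
CANONICAL = ["application/json", "text/turtle", "application/rdf+xml"]
DEFAULT = "application/json"

def normalize_accept(accept_header: str) -> str:
    """Return one canonical content type for any incoming Accept header."""
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    for ctype in offered:
        if ctype in CANONICAL:
            return ctype
    return DEFAULT  # anything unrecognized collapses to a single value
```

Any two clients whose Accept headers map to the same canonical value then share one cached variant instead of fragmenting the cache.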
BBlack added a comment.
Thanks for the data and the patch! We'll dig into the DNS patch next week and get it merged in so we're serving wikiba.se from our DNS as-is (as in, pointing at your existing server IPs). Then we can do handoff of the domain ownership/registration without c
BBlack added a comment.
There's still a couple of things that can be done serially at present, one of which is necessary for the cert issuance later:
Switch the nameservers for wikiba.se to ns[012].wikimedia.org with your current registrar (United Domains). We have to have this to later
BBlack added a comment.
There are different layers of "handing off" DNS management which are being conflated, but to run through them in order:
"Point the A record to the right place" - We don't support this, and can't realistically. We need control of the zone da
BBlack added a comment.
Looking at an internal version of the flavor=dump outputs of an entity,
related observations:
Test request from the inside: `curl -v
'https://www.wikidata.org/wiki/Special:EntityData/Q15223487.ttl?flavor=dump'
--resolve www.wikidata.org:44
BBlack added a comment.
I think it would be better, from my perspective, to really understand the
use-cases better (which I don't). Why do these remote clients need "realtime"
(no staleness) fetches of Q items? What I hear is it sounds like all clients
expect everything
BBlack added a comment.
I think you ran into a temporary blip in some unrelated DNS work (which is
already dealt with), not this bug (502 errors can happen for real infra failure
reasons, too!)
TASK DETAIL
https://phabricator.wikimedia.org/T237319
EMAIL PREFERENCES
https
BBlack added a comment.
The commit is staged above, but we should probably hold until the 16th or so
just in case.
TASK DETAIL
https://phabricator.wikimedia.org/T109072
To: BBlack
Cc: gerritbot
BBlack moved this task to Done on the Traffic workboard.
TASK DETAIL
https://phabricator.wikimedia.org/T107602
WORKBOARD
https://phabricator.wikimedia.org/project/board/1201/
To: Joe, BBlack
Cc: jeremyb
BBlack added a subscriber: BBlack.
BBlack added a comment.
I'm not a fan of this on a few levels:
1. As Andrew said above, why not support this directly in WDQS if you have to
support it at all? As in, let the POSTs come through the rest of the stack
unmolested, and deal with it inside
BBlack added a comment.
In https://phabricator.wikimedia.org/T112151#1739918, @Smalyshev wrote:
> > As Andrew said above, why not support this directly in WDQS if you have to
> > support it at all?
>
>
> Because in Blazegraph, allowing POST means allowing write requests. Th
BBlack added a comment.
POST isn't just theoretically imperfect. Proliferation of POST for what should
be cacheable, readonly, idempotent queries is a serious long-term problem for
us. Primarily it's that we can't cache them in Varnish like we should be able
to (which absorbs
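The cacheability point can be made concrete with a small sketch. A read-only SPARQL query sent as POST is opaque to the edge caches, while the same query expressed as a GET URL is a keyable, idempotent object. The endpoint and parameter name below are assumptions for illustration, not the production interface.

```python
from urllib.parse import urlencode

# Illustrative only: rewrite a read-only SPARQL query as a GET request so
# the edge cache can treat it as a normal cacheable object. The endpoint
# URL and "query" parameter name are assumed for the sketch.
ENDPOINT = "https://query.example.org/sparql"

def as_cacheable_get(sparql: str) -> str:
    """Return a GET URL the caches can key on for a read-only query."""
    return ENDPOINT + "?" + urlencode({"query": sparql})
```

The catch, as discussed elsewhere in this thread, is URL length limits: very large queries don't fit in a GET request line, which is why clients fall back to POST in the first place.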
BBlack added a project: Traffic.
TASK DETAIL
https://phabricator.wikimedia.org/T119917
To: BBlack
Cc: BBlack, Aklapper, Smalyshev, jkroll, Wikidata-bugs, Jdouglas, aude,
Deskana, Manybubbles, Mbch331
BBlack added a subscriber: BBlack.
BBlack added a comment.
It would be best to use the header `X-Client-IP` as the notion of the client IP
address for these sorts of purposes. This is intended to resolve trusted XFF,
but has a much shorter list (intended to be improved on), whereas TrustedXFF
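The "resolve trusted XFF" idea can be sketched as follows: walk the X-Forwarded-For chain from the right, skipping hops that belong to known/trusted proxy ranges, and take the first untrusted hop as the real client. The trusted networks below are stand-ins, not the production list.

```python
import ipaddress

# Sketch of resolving a client IP from X-Forwarded-For plus the direct peer.
# The trusted proxy networks are illustrative assumptions only.
TRUSTED = [ipaddress.ip_network("10.0.0.0/8"),
           ipaddress.ip_network("192.0.2.0/24")]

def is_trusted(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED)

def client_ip(xff: str, peer: str) -> str:
    """Rightmost-untrusted walk: the first hop not in a trusted range
    is taken as the client; if all hops are trusted, fall back to the
    leftmost XFF entry."""
    hops = [h.strip() for h in xff.split(",")] + [peer]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0]
```

Walking from the right matters because the leftmost XFF entries are client-supplied and trivially spoofable; only hops appended by your own trusted proxies are reliable.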
BBlack added a project: Wikidata.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
To: BBlack
Cc: MZMcBride, Luke081515, Denniss, aaron, faidon, Joe, ori, BBlack, Aklapper,
Wikidata-bugs, aude
BBlack added a subscriber: JanZerebecki.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
To: BBlack
Cc: JanZerebecki, MZMcBride, Luke081515, Denniss, aaron, faidon, Joe, ori,
BBlack, Aklapper
BBlack added a comment.
Yeah but the rate increase we're looking at is actually in the htmlCacheUpdate
job insertion rate, regardless of magnification due to
pages-affected-per-update. I'm surprised that we don't have any logs/data as
to the source of those jobs.
TASK
BBlack added a comment.
Well, we have 3 different stages of rate-increase in the insert graph, so it
could well be that we have 3 independent causes to look at here. And it's not
necessarily true that any of them are buggy, but we need to understand what
they're doing and why, bec
BBlack added a comment.
Continuing with some stuff I was saying in IRC the other day. At the "new
normal", we're seeing something in the approximate ballpark of 400/s articles
purged (which is then multiplied commonly for ?action=history and mobile and
ends up more like ~160
BBlack added a comment.
Another data point from the weekend: In one sample I took Saturday morning,
when I sampled for 300s, the top site being purged was srwiki, and something
like 98% of the purges flowing for srwiki were all Talk: pages (well, with
Talk: as %-encoded something in Serbian
BBlack added a comment.
@daniel - Sorry I should have linked this earlier, I made a paste at the time:
https://phabricator.wikimedia.org/P2547 . Note that
`/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:` is the Serbian srwiki
version of `/Talk:`.
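As a quick sanity check with Python's standard urllib, the %-encoded prefix really does decode to the Cyrillic "Talk:" namespace:

```python
from urllib.parse import unquote

# Decode the %-encoded srwiki prefix from the paste; it is the
# Cyrillic (Serbian) "Talk:" namespace.
prefix = "/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:"
assert unquote(prefix) == "/Разговор:"
```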
TASK DETAIL
https
BBlack added a comment.
Regardless, the average rate of HTCP these days is normally-flat-ish (a few
scary spikes aside), and is mostly throttled by the jobqueue. The question
still remains: what caused permanent, large bumps in the jobqueue
htmlCacheUpdate insertion rate on ~Dec4, ~Dec11, and
BBlack added a comment.
@ori - yeah that makes sense for the initial bump, and I think there may have
even been a followup to do deferred purges, which may be one of the other
multipliers, but I haven't found it yet (as in, insert an immediate job and
also somehow insert one that fi
BBlack added a comment.
FYI - "cache_status" is not an accurate reflection of anything. I'm not sure
why we really even log it for analytics. The problem is that it only reflects
some varnish state about the first of up to 3 layers of caching, and even then
it does so poorly
BBlack added a comment.
Well then apparently the 10/s edits to all projects number I found before is
complete bunk :)
http://wikipulse.herokuapp.com/ has numbers for wikidata edits that
approximately line up with yours, and then shows Wikipedias at about double
that rate (which might be a
BBlack added a comment.
So, current thinking is that at least one of (maybe two of?) the bumps are from
moving what used to be synchronous HTCP purge during requests to JobRunner jobs
which should be doing the same thing. However, assuming it's that alone (or
even just investigating that
BBlack added a comment.
heh so: https://phabricator.wikimedia.org/T113192 ->
https://gerrit.wikimedia.org/r/#/c/258365/5 is probably the Jan 20 bump.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a comment.
IIRC, the problem we've beat our heads against in past SPARQL-related tickets
is the fact that SPARQL clients are using `POST` method for readonly queries,
due to argument length issues and whatnot. On the surface, that's a
dealbreaker for caching them as `P
BBlack added a comment.
In https://phabricator.wikimedia.org/T125392#1994242, @Milimetric wrote:
> @BBlack - so you think cache_status is not even close to accurate? Do we
> have other accurate measurements of it so we could compare to what extent
> it's misleading? I'm
BBlack added a comment.
In https://phabricator.wikimedia.org/T126730#2034900, @Christopher wrote:
> I may be wrong, but the headers that are returned from a request to the nginx
> server wdqs1002 say that varnish 1.1 is already being used there.
It's varnish 3.0.6 currently (4.
BBlack added a comment.
Bringing this conversation back here from the comments in
https://gerrit.wikimedia.org/r/#/c/228411/
> The short summary about what this does is:
> A read only mirror of Wikidata.org (only the public information) in a
> special database for anyone to ru
BBlack added a comment.
In https://phabricator.wikimedia.org/T107602#1507676, @JanZerebecki wrote:
> If we put it in misc then this would be the first that has another level
> behind misc instead of one named server. I have no preference. You or whoever
> wants to merge it chooses?
BBlack added a subscriber: BBlack.
BBlack added a comment.
Do we actually need an internal service endpoint like `wdqs.svc.eqiad.wmnet`
for this, or are we just doing this as part of a standard construction for
"services with multiple hosts behind public varnish/LVS need another layer of
BBlack added a comment.
Does someone have the specific details here on what cookie name to wipe on
requests to what domainname(s)?
TASK DETAIL
https://phabricator.wikimedia.org/T109038
To: BBlack
Cc
BBlack added a comment.
As best I can tell from my own testing (but I think someone with deeper insight
into the CORS change for (www|query).wikidata.org and S:UL and such would need
to confirm this sounds sane): I was able to reproduce the issue, and I was able
to apparently perma-fix it for
BBlack added a subscriber: JanZerebecki.
BBlack added a comment.
We think the workaround deployed via https://gerrit.wikimedia.org/r/231556
should fix this up well enough. It worked for @JanZerebecki who still had the
old bad cookie, which the fixup wiped out. Can others confirm?
TASK
BBlack added a comment.
Ah, that makes some logical sense. We should probably strip the duplicate
_User cookie the way we do for the duplicate _Token, to address the bulk of it
TASK DETAIL
https://phabricator.wikimedia.org/T109038
BBlack added a comment.
Pushed a fix for deleting the duplicate centralauth_User for wikidata.org,
should be in effect globally now.
TASK DETAIL
https://phabricator.wikimedia.org/T109038
To: JanZerebecki
BBlack removed parent tasks: T174932: Recurrent 'mailbox lag' critical alerts and 500s, T175473: Multiple 503 Errors.
TASK DETAIL
https://phabricator.wikimedia.org/T175588
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: BBlack
Cc: Aklappe
BBlack added a comment.
Can you explain in more detail? Is the subject of this ticket what was shown as an error in your browser window? I doubt this is related to varnish and/or "mailbox lag".
TASK DETAIL
https://phabricator.wikimedia.org/T175588
EMAIL PREFERENCES
https://phabricator.wik
BBlack added a comment.
Copying this in from etherpad (this is less awful than 6 hours of raw IRC+SAL logs, but still pretty verbose):
# cache servers work ongoing here, ethtool changes that require short depooled downtimes around short ethernet port outages:
17:49 bblack: ulsfo cp servers
BBlack added a comment.
My gut instinct remains what it was at the end of the log above. I think something in the revert of wikidatawiki to wmf.4 fixed this. And I think given the timing alignment of the Fix sorting of NullResults changes + the initial ORES->wikidata fatals makes those
BBlack added a comment.
Unless anyone objects, I'd like to start with reverting our emergency varnish max_connections changes from https://gerrit.wikimedia.org/r/#/c/386756 . Since the end of the log above, connection counts have returned to normal, which is ~100, which is 1/10th the norm
BBlack added a comment.
In T179156#3715432, @hoo wrote:
I think I found the root cause now; it seems it's actually related to the WikibaseQualityConstraints extension:
Isn't that the same extension referenced in the suspect commits mentioned above?
18:51 ladsgroup@tin: Synchronized
BBlack added a comment.
Updates from the Varnish side of things today (since I've been bad about getting commits/logs tagged onto this ticket):
18:15 - I took over looking at today's outburst on the Varnish side
The current target at the time was cp1053 (after elukey's earlier re
BBlack added a comment.
A while after the above, @hoo started focusing on a different aspect of this we've been somewhat ignoring as more of a side-symptom: that there tend to be a lot of sockets in a strange state on the "target" varnish, to various MW nodes. They look strange on
BBlack added a comment.
Does Echo have any kind of push notification going on, even in light testing yet?
TASK DETAIL
https://phabricator.wikimedia.org/T179156
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: BBlack
Cc: ema, Gehel, Smalyshev, TerraCodes, Jay8g
BBlack added a comment.
Now that I'm digging deeper, it seems there are one or more projects in progress built around Push-like things, in particular T113125 . I don't see any evidence that there's been live deploy of them yet, but maybe I'm missing something or other. If w
BBlack added a comment.
In T179156#3718772, @ema wrote:
There's a timeout limiting the total amount of time varnish is allowed to spend on a single request, send_timeout, defaulting to 10 minutes. Unfortunately there's no counter tracking when the timer kicks in, although a debug line
BBlack added a comment.
Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on our frontends, acts as the limit for receiving all client request headers, but I'm not sure that it has such a limitation that applies to client-sent bodie
BBlack added a comment.
In T179156#3719928, @daniel wrote:
In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into the stack
That's assuming varnish always caches the entire request, and never "streams" to the backend, even fo
BBlack lowered the priority of this task from "Unbreak Now!" to "High".
BBlack added a comment.
Reducing this from UBN->High, because current best-working-theory is this problem is gone so long as we keep the VCL do_stream=false change reverted. Obviously, there's still
BBlack added a comment.
In T179156#3719995, @BBlack wrote:
We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though.
Rewinding a little: this is false, I was just getting confused by terminology. Commons "chunked"
BBlack added a comment.
In T179156#3720392, @BBlack wrote:
In T179156#3719995, @BBlack wrote:
We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though.
Rewinding a little: this is false, I was just getting confused by
BBlack lowered the priority of this task from "High" to "Normal".
BBlack changed the task status from "Open" to "Stalled".
BBlack added a comment.
The timeout changes above will offer some insulation, and as time passes we're not seeing evidence of th
BBlack added a comment.
No, we never made an incident rep on this one, and I don't think it would be fair at this time to implicate ORES as a cause. We can't really say that ORES was directly involved at all (or any of the other services investigated here). Because the cause was so
BBlack added a comment.
It's a pain any direction we slice this, and I'm not fond of adding new canonical domains outside the known set for individual low-traffic projects. We didn't add new domains for a variety of other public-facing efforts (e.g. wdqs, ORES, maps, etc).
We d
BBlack added a comment.
This is probably due to backend timeouts, I would guess? The default
applayer settings being applied to wdqs include `between_bytes_timeout` at only
4s, whereas `first_byte_timeout` is 185s. So if wdqs delayed all output, it
would have 3 minutes or so, but once it
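The interaction of the two timeouts can be sketched as follows. This is a simplified model for illustration, not Varnish's actual implementation: the first byte may take up to `first_byte_timeout`, but every subsequent inter-byte gap is bounded by the much smaller `between_bytes_timeout`, so a backend that starts quickly and then trickles with long pauses still fails.

```python
# Simplified model of the two backend-fetch timeouts discussed above,
# using the values from the applayer defaults mentioned in this comment.
FIRST_BYTE_TIMEOUT = 185.0   # seconds until the first response byte
BETWEEN_BYTES_TIMEOUT = 4.0  # seconds allowed between subsequent bytes

def fetch_ok(gaps: list[float]) -> bool:
    """gaps[0] is the wait before the first byte; the remaining entries
    are gaps between later bytes/chunks. True if no timeout fires."""
    if not gaps:
        return True
    if gaps[0] > FIRST_BYTE_TIMEOUT:
        return False
    return all(g <= BETWEEN_BYTES_TIMEOUT for g in gaps[1:])
```

Under this model a response that takes two minutes to start but then streams steadily succeeds, while one that starts instantly but pauses for 5s mid-body does not.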
BBlack added a comment.
I did some live experimentation with manual edits to the VCL. It is the
`between_bytes_timeout`, but the situation is complex. The timeout that's
failing is on the varnish frontend fetching from the varnish backend. These
are fixed at 2s, but because this i
BBlack edited projects, added Traffic; removed Varnish.
TASK DETAIL
https://phabricator.wikimedia.org/T127014
To: Gehel, BBlack
Cc: gerritbot, BBlack, Gehel, Nikki, Mbch331, Magnus, JanZerebecki, Smalyshev
BBlack added a blocking task: T128813: cache_misc's misc_fetch_large_objects
has issues.
TASK DETAIL
https://phabricator.wikimedia.org/T127014
To: Gehel, BBlack
Cc: gerritbot, BBlack, Gehel, Nikki, Mb
BBlack added a comment.
In https://phabricator.wikimedia.org/T121135#1910435, @Atsirlin wrote:
> @Legoktm: Frankly speaking, for a small project like Wikivoyage the cache
brings no obvious benefits, but triggers many serious issues including the
problem of page banners and ToC.
BBlack edited the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
To: BBlack
Cc: Smalyshev, gerritbot, Legoktm, Addshore, daniel, hoo, aude,
Lydia_Pintscher, JanZerebecki, MZMcBride
BBlack added a comment.
F3845100: Screen Shot 2016-04-07 at 7.47.28 PM.png
<https://phabricator.wikimedia.org/F3845100>
TASK DETAIL
https://phabricator.wikimedia.org/T124418
To: BBlack
Cc: Sma
BBlack closed this task as "Resolved".
BBlack claimed this task.
BBlack added a comment.
We ended up solving this in reverse order. The between_bytes_timeout values
are already raised for varnish<->varnish, but we're still working on the
broader stream issues. So this s
BBlack closed blocking task T128813: cache_misc's misc_fetch_large_objects has
issues as "Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T126730
To: BBlack
Cc: Krinkle, gerritbot,
BBlack closed blocking task T128813: cache_misc's misc_fetch_large_objects has
issues as "Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T127014
To: BBlack
Cc: gerritbot, BBlac
BBlack created this task.
Herald added a subscriber: Aklapper.
Herald added projects: Operations, Wikidata, Discovery.
TASK DESCRIPTION
Currently wdqs is defined directly in varnish as a pool of randomized backend
hostnames. There should be a real service hostname for the internal service
BBlack edited projects, added Traffic; removed Varnish.
TASK DETAIL
https://phabricator.wikimedia.org/T133490
To: BBlack
Cc: Aklapper, Mushroom, Avner, debt, TerraCodes, Gehel, D3r1ck01, FloNight,
Izno
BBlack added a blocked task: T133821: Content purges are unreliable.
TASK DETAIL
https://phabricator.wikimedia.org/T102476
To: BBlack
Cc: ArielGlenn, hoo, Addshore, RobLa-WMF, StudiesWorld, intracer
BBlack added a blocked task: T133821: Content purges are unreliable.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
To: BBlack
Cc: Smalyshev, gerritbot, Legoktm, Addshore, daniel, hoo, aude
BBlack added a comment.
I really don't think it's specifically Wikidata-related either at this point.
Wikidata might be a significant driver of update jobs in general, but the code
changes driving the several large rate increases were probably generic to all
update jobs.
T
BBlack added a comment.
Do you know if some normal traffic is affected, such that we'd know a start
date for a recent change in behavior? Or is it suspected that it was always
this way?
I've been digging through some debugging on this URL (which is an applayer
chunked-respon
BBlack added a comment.
Just jotting down the things I know so far from investigating this morning.
I still don't have a good answer yet.
Based on just the test URL, debugging it extensively at various layers:
1. The response size of that URL is in the ballpark of 32KB uncompr
BBlack added a comment.
Did some further testing on an isolated test machine, using our current
varnish3 package.
- Got 2833-byte test file from uncorrupted (--compressed) output on prod.
This is the exact compressed content bytes emitted by MW/Apache for the broken
test URL
BBlack added a comment.
Thanks for merging in the probably-related tasks. I had somehow missed
really noticing T123159 earlier... So probably digging into gunzip itself
isn't a fruitful path. I'm going to open a separate blocker for this that's
private, so we can keep
BBlack added a blocking task: Restricted Task.
TASK DETAIL
https://phabricator.wikimedia.org/T133866
To: BBlack
Cc: Trung.anh.dinh, MZMcBride, Anomie, Yurivict, TerraCodes, Orlodrim, BBlack,
akosiaris
BBlack triaged this task as "High" priority.
TASK DETAIL
https://phabricator.wikimedia.org/T133866
To: BBlack
Cc: Trung.anh.dinh, MZMcBride, Anomie, Yurivict, TerraCodes, Orlodrim, BBlack,
BBlack closed blocking task Restricted Task as "Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T133866
To: BBlack
Cc: Ricordisamoa, Trung.anh.dinh, MZMcBride, Anomie, Yurivict,
BBlack added a blocking task: T131501: Convert misc cluster to Varnish 4.
TASK DETAIL
https://phabricator.wikimedia.org/T133490
To: BBlack
Cc: Bovlb, Aklapper, Mushroom, Avner, debt, Gehel, D3r1ck01
BBlack added a comment.
So, as it turns out, this is a general varnishd bug in our specific varnishd
build. For purposes of this bug, our varnishd code is essentially 3.0.7 plus a
bunch of ancient forward-ported 'plus' patches related to streaming, and we're
missing
htt
BBlack added a comment.
We now have some understanding of the mechanism of this bug (
https://phabricator.wikimedia.org/T133866#2275985 ). It should go away in the
imminent varnish 4 upgrade of the misc cluster in
https://phabricator.wikimedia.org/T131501.
TASK DETAIL
https
BBlack closed this task as "Resolved".
BBlack claimed this task.
BBlack added a comment.
This works now. There's a significant pause at the start of the transfer
from the user's perspective if it's not a cache hit, because streaming is
disabled as a workaround (so
BBlack closed this task as "Resolved".
BBlack added a comment.
My test cases on cache_text work now, should be resolved!
TASK DETAIL
https://phabricator.wikimedia.org/T133866
To: BBlack
Cc:
BBlack added a comment.
Assuming there was no transient issue (which became cached) on the wdqs end
of things, then this was likely a transient thing from nginx experiments or the
cache_misc varnish4 upgrade. I banned all wdqs objects from cache_misc and now
your test URL works fine. Can
BBlack added a comment.
Status update: we've been debugging this off and on all day. It's some kind
of bug fallout from cache_misc's upgrade to Varnish 4. It's a very complicated
bug, and we don't really understand it yet. We've made some band-aid fixes to
BBlack added a subscriber: Kghbln.
BBlack merged a task: T135121: stats.wikimedia.org down.
TASK DETAIL
https://phabricator.wikimedia.org/T134989
To: BBlack
Cc: Kghbln, ema, Stashbot, Luke081515, matmarex
BBlack added a comment.
In the merged ticket above, it's browser access to stats.wm.o, and the
browser's getting a 304 Not Modified and complaining about it (due to missing
character encoding supposedly, but it's entirely likely it's missing everything
and that'
BBlack added a comment.
So we currently have several experiments in play trying to figure this out:
1. We've got 2x upstream bugfixes applied to our varnishd on cache_misc:
https://github.com/varnishcache/varnish-cache/commit/d828a042b3fc2c2b4f1fea83021f0d5508649e50
BBlack added a comment.
Has anyone been able to reproduce any of the problems in the tickets merged
into here, since roughly the timestamp of the above message?
TASK DETAIL
https://phabricator.wikimedia.org/T134989
BBlack added a comment.
I forgot one of our temporary hacks in the list above in
https://phabricator.wikimedia.org/T134989#2290254:
4. https://gerrit.wikimedia.org/r/#/c/288656/ - we also enabled a critical
small bit here in v4 vcl_hit. I reverted this for now during the varnish3
BBlack added a blocked task: T131501: Convert misc cluster to Varnish 4.
TASK DETAIL
https://phabricator.wikimedia.org/T134989
To: BBlack
Cc: Ronarts12, Krenair, Dzahn, GWicke, Smalyshev, Heather, Nirzar
BBlack reopened blocking task T131501: Convert misc cluster to Varnish 4 as
"Open".
TASK DETAIL
https://phabricator.wikimedia.org/T133490
To: BBlack
Cc: MZMcBride, gerritbot, BBlack, Bovlb
BBlack added a comment.
Current State:
- cp3007 and cp1045 are depooled from user traffic, icinga-downtimed for
several days, and have puppet disabled. Please do not re-enable puppet on
these! They also have confd shut down, and are running custom configs to
continue debugging this
BBlack added a comment.
cache_maps cluster switched to the new varnish package today
TASK DETAIL
https://phabricator.wikimedia.org/T134989
To: BBlack
Cc: jeremyb, Ronarts12, Krenair, Dzahn, GWicke
BBlack added a subscriber: GWicke.
BBlack added a comment.
@aaron and @GWicke - both patches sound promising, thanks for digging into this topic!
TASK DETAIL
https://phabricator.wikimedia.org/T124418
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: BBlack
Cc
BBlack added a comment.
30 minutes isn't really reasonable, and neither is spamming more purge traffic. If there's a constant risk of the page content breaking without invalidation, how is even 30 minutes acceptable? Doesn't this mean that on average they'll be broken for