[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-04-25 Thread BBlack
BBlack added a comment. Re: `wikibase.org`, adding it as a non-canonical redirection to catch confusion from those who manually type URLs is fine, but we should make sure everyone is clear on which domain name is canonical for this project (I assume `https://wikiba.se/`) and make sure

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-04-25 Thread BBlack
BBlack added a comment. @WMDE-leszek Thanks for looking into it! I believe @CRoslof is who you want to coordinate with on our end, whose last statement on this topic back in January was: In T99531#4878798 <https://phabricator.wikimedia.org/T99531#4878798>, @CRoslof

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-08-14 Thread BBlack
BBlack added a comment. As noted in T155359 <https://phabricator.wikimedia.org/T155359> - WMDE has moved the hosting of this to some other platform, including the DNS hosting (and we never had the whois entry). So this task can resolve as Decline I think (or whatever), but we should

[Wikidata-bugs] [Maniphest] [Commented On] T232006: LDF service does not Vary responses by Accept, sending incorrect cached responses to clients

2019-09-18 Thread BBlack
BBlack added a comment. We'll also need to normalize the incoming `Accept` headers up in the edge cache layer to avoid pointless vary explosions. Ideally the normalization should exactly match the application-layer logic that chooses the output content type. Do you have some pseudo
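The normalization idea above can be sketched in Python (a minimal illustration only; the real implementation would be edge-cache VCL, and the canonical media-type list here is an assumption for illustration — it would have to exactly mirror the LDF service's own content negotiation, as the comment says):

```python
# Sketch of edge-side Accept normalization to bound Vary-based cache variants.
# CANONICAL_TYPES and DEFAULT_TYPE are assumptions for illustration; the real
# list must match the application layer's content-type selection exactly.

CANONICAL_TYPES = ["application/ld+json", "text/turtle", "text/html"]
DEFAULT_TYPE = "text/html"

def normalize_accept(accept_header: str) -> str:
    """Collapse an arbitrary Accept header onto a small canonical set, so a
    cache varying on Accept sees only a handful of variants per URL."""
    if not accept_header:
        return DEFAULT_TYPE
    # Parse "type;q=0.9, type2" into (media_type, q) pairs.
    prefs = []
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        mtype = fields[0].strip().lower()
        q = 1.0
        for param in fields[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((mtype, q))
    # Pick the highest-q type the client accepts that we canonically serve.
    prefs.sort(key=lambda p: p[1], reverse=True)
    for mtype, q in prefs:
        if q <= 0:
            continue
        if mtype in CANONICAL_TYPES:
            return mtype
        if mtype in ("*/*", "text/*", "application/*"):
            return DEFAULT_TYPE
    return DEFAULT_TYPE
```

Any client then maps to one of four cache variants at most, instead of one variant per distinct Accept string.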

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2018-11-21 Thread BBlack
BBlack added a comment. Thanks for the data and the patch! We'll dig into the DNS patch next week and get it merged in so we're serving wikiba.se from our DNS as-is (as in, pointing at your existing server IPs). Then we can do handoff of the domain ownership/registration without c

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2018-12-13 Thread BBlack
BBlack added a comment. There's still a couple of things that can be done serially at present, one of which is necessary for the cert issuance later: Switch the nameservers for wikiba.se to ns[012].wikimedia.org with your current registrar (United Domains). We have to have this to later

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-02-20 Thread BBlack
BBlack added a comment. There are different layers of "handing off" DNS management which are being conflated, but to run through them in order: "Point the A record to the right place" - We don't support this, and can't realistically. We need control of the zone da

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggressive cache busting behaviour of wdqs-updater

2019-03-08 Thread BBlack
BBlack added a comment. Looking at an internal version of the flavor=dump outputs of an entity, related observations: Test request from the inside: `curl -v 'https://www.wikidata.org/wiki/Special:EntityData/Q15223487.ttl?flavor=dump' --resolve www.wikidata.org:44

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggressive cache busting behaviour of wdqs-updater

2019-03-12 Thread BBlack
BBlack added a comment. I think it would be better, from my perspective, to really understand the use cases (which I don't). Why do these remote clients need "realtime" (no staleness) fetches of Q items? What I'm hearing is that all clients expect everything

[Wikidata-bugs] [Maniphest] [Commented On] T237319: 502 errors on ATS/8.0.5

2019-11-26 Thread BBlack
BBlack added a comment. I think you ran into a temporary blip in some unrelated DNS work (which is already dealt with), not this bug (502 errors can happen for real infra failure reasons, too!) TASK DETAIL https://phabricator.wikimedia.org/T237319 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] [Commented On] T109072: [Task] Revert https://gerrit.wikimedia.org/r/#/c/231556/3 on 2015-09-14

2015-09-15 Thread BBlack
BBlack added a comment. The commit is staged above, but we should probably hold until the 16th or so just in case.

[Wikidata-bugs] [Maniphest] [Changed Project Column] T107602: Set up a public interface to the wikidata query service

2015-09-22 Thread BBlack
BBlack moved this task to Done on the Traffic workboard.

[Wikidata-bugs] [Maniphest] [Commented On] T112151: Support POST for SPARQL query endpoint

2015-10-20 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. I'm not a fan of this on a few levels: 1. As Andrew said above, why not support this directly in WDQS if you have to support it at all? as in, let the POSTs come through the rest of the stack unmolested, and deal with it inside

[Wikidata-bugs] [Maniphest] [Commented On] T112151: Support POST for SPARQL query endpoint

2015-10-21 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T112151#1739918, @Smalyshev wrote: > > As Andrew said above, why not support this directly in WDQS if you have to > > support it at all? > > > Because in Blazegraph, allowing POST means allowing write requests. Th

[Wikidata-bugs] [Maniphest] [Commented On] T112151: Support POST for SPARQL query endpoint

2015-11-19 Thread BBlack
BBlack added a comment. POST isn't just theoretically imperfect. Proliferation of POST for what should be cacheable, readonly, idempotent queries is a serious long-term problem for us. Primarily it's that we can't cache them in Varnish like we should be able to (which absorbs
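One mitigation consistent with the argument here is rewriting small readonly query POSTs as GETs so the edge caches can store and coalesce them again. A minimal Python sketch of that idea (the path, parameter name, and the URL-length budget are assumptions for illustration, not the values any deployed system uses):

```python
from urllib.parse import urlencode

MAX_GET_QUERY_BYTES = 2000  # assumed conservative URL-length budget

def post_to_cacheable_get(path: str, sparql_query: str):
    """If a readonly SPARQL POST body is small enough, express it as a GET
    URL so intermediary caches can store the (idempotent) response.
    Returns the GET URL, or None when the query is too long and must
    remain a POST."""
    qs = urlencode({"query": sparql_query})
    if len(qs) > MAX_GET_QUERY_BYTES:
        return None
    return f"{path}?{qs}"
```

Long queries still fall through as POSTs, which is exactly the uncacheable case the comment is worried about proliferating.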

[Wikidata-bugs] [Maniphest] [Updated] T119917: Set up backend per-IP limits on varnish for WDQS

2015-12-01 Thread BBlack
BBlack added a project: Traffic.

[Wikidata-bugs] [Maniphest] [Commented On] T119917: Set up backend per-IP limits on varnish for WDQS

2015-12-01 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. It would be best to use the header `X-Client-IP` as the notion of the client IP address for these sorts of purposes. This is intended to resolve trusted XFF, but has a much shorter list (intended to be improved on), whereas TrustedXFF
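The per-IP limiting described here can be sketched as a token bucket keyed on `X-Client-IP` (a minimal illustration; the rate and burst values are invented for the example, and the real enforcement would live in the Varnish layer, not application Python):

```python
import time
from collections import defaultdict

class PerClientRateLimiter:
    """Token-bucket limiter keyed on the trusted X-Client-IP header.
    Rates here are illustrative, not production values."""

    def __init__(self, rate_per_sec=10.0, burst=50, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock
        self.buckets = defaultdict(lambda: {"tokens": burst, "ts": clock()})

    def allow(self, headers) -> bool:
        # Key on X-Client-IP (resolved from trusted XFF at the edge), not the
        # raw connection address, so requests relayed through our own proxy
        # layers are attributed to the real client.
        client_ip = headers.get("X-Client-IP", "unknown")
        b = self.buckets[client_ip]
        now = self.clock()
        # Refill proportionally to elapsed time, capped at the burst size.
        b["tokens"] = min(self.burst, b["tokens"] + (now - b["ts"]) * self.rate)
        b["ts"] = now
        if b["tokens"] >= 1:
            b["tokens"] -= 1
            return True
        return False
```

The injectable `clock` makes the refill logic deterministic to test; in production the wall clock drives it.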

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a project: Wikidata.

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a subscriber: JanZerebecki.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a comment. Yeah but the rate increase we're looking at is actually in the htmlCacheUpdate job insertion rate, regardless of magnification due to pages-affected-per-update. I'm surprised that we don't have any logs/data as to the source of those jobs.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Well, we have 3 different stages of rate-increase in the insert graph, so it could well be that we have 3 independent causes to look at here. And it's not necessarily true that any of them are buggy, but we need to understand what they're doing and why, bec

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Continuing with some stuff I was saying in IRC the other day. At the "new normal", we're seeing something in the approximate ballpark of 400/s articles purged (which is then multiplied commonly for ?action=history and mobile and ends up more like ~160

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Another data point from the weekend: In one sample I took Saturday morning, when I sampled for 300s, the top site being purged was srwiki, and something like 98% of the purges flowing for srwiki were all Talk: pages (well, with Talk: as %-encoded something in Serbian

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @daniel - Sorry I should have linked this earlier, I made a paste at the time: https://phabricator.wikimedia.org/P2547 . Note that `/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:` is the Serbian srwiki version of `/Talk:`.
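The %-encoded prefix is just UTF-8 percent-encoding of the srwiki Talk namespace name, which is quick to verify (the sample path suffix in the usage check is made up for illustration):

```python
from urllib.parse import unquote

# The paste's %-encoded path prefix is the UTF-8 percent-encoding of the
# srwiki "Talk:" namespace name.
prefix = "/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:"
decoded = unquote(prefix)  # -> "/Разговор:" (Serbian for "Talk:")

def is_srwiki_talk_page(path: str) -> bool:
    """True when a purged URL path is under the srwiki Talk: namespace,
    whether it arrives %-encoded or already decoded."""
    return unquote(path).startswith(decoded)
```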

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Regardless, the average rate of HTCP these days is normally-flat-ish (a few scary spikes aside), and is mostly throttled by the jobqueue. The question still remains: what caused permanent, large bumps in the jobqueue htmlCacheUpdate insertion rate on ~Dec4, ~Dec11, and

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @ori - yeah that makes sense for the initial bump, and I think there may have even been a followup to do deferred purges, which may be one of the other multipliers, but I haven't found it yet (as in, insert an immediate job and also somehow insert one that fi

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-03 Thread BBlack
BBlack added a comment. FYI - "cache_status" is not an accurate reflection of anything. I'm not sure why we really even log it for analytics. The problem is that it only reflects some varnish state about the first of up to 3 layers of caching, and even then it does so poorly

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. Well then apparently the 10/s edits to all projects number I found before is complete bunk :) http://wikipulse.herokuapp.com/ has numbers for wikidata edits that approximately line up with yours, and then shows Wikipedias at about double that rate (which might be a

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. So, current thinking is that at least one of (maybe two of?) the bumps are from moving what used to be synchronous HTCP purge during requests to JobRunner jobs which should be doing the same thing. However, assuming it's that alone (or even just investigating that

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. heh so: https://phabricator.wikimedia.org/T113192 -> https://gerrit.wikimedia.org/r/#/c/258365/5 is probably the Jan 20 bump.

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-15 Thread BBlack
BBlack added a comment. IIRC, the problem we've beat our heads against in past SPARQL-related tickets is the fact that SPARQL clients are using `POST` method for readonly queries, due to argument length issues and whatnot. On the surface, that's a dealbreaker for caching them as `P

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-16 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T125392#1994242, @Milimetric wrote: > @BBlack - so you think cache_status is not even close to accurate? Do we > have other accurate measurements of it so we could compare to what extent > it's misleading? I'm

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-17 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T126730#2034900, @Christopher wrote: > I may be wrong, but the headers that are returned from a request to the nginx > server wdqs1002 say that varnish 1.1 is already being used there. It's varnish 3.0.6 currently (4.

[Wikidata-bugs] [Maniphest] [Updated] T107602: Set up a public interface to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a comment. Bringing this conversation back here from the comments in https://gerrit.wikimedia.org/r/#/c/228411/ > The short summary about what this does is: > A read only mirror of Wikidata.org (only the public information) in a > special database for anyone to ru

[Wikidata-bugs] [Maniphest] [Commented On] T107602: Set up a public interface to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T107602#1507676, @JanZerebecki wrote: > If we put it in misc then this would be the first that has another level > behind misc instead of one named server. I have no preference. You or whoever > wants to merge it chooses?

[Wikidata-bugs] [Maniphest] [Commented On] T107601: Assign an LVS service to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. Do we actually need an internal service endpoint like `wdqs.svc.eqiad.wmnet` for this, or are we just doing this as part of a standard construction for "services with multiple hosts behind public varnish/LVS need another layer of

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a comment. Does someone have the specific details here on what cookie name to wipe on requests to what domainname(s)?

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a comment. As best I can tell from my own testing (but I think someone with deeper insight into the CORS change for (www|query).wikidata.org and S:UL and such would need to confirm this sounds sane): I was able to reproduce the issue, and I was able to apparently perma-fix it for

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a subscriber: JanZerebecki. BBlack added a comment. We think the workaround deployed via https://gerrit.wikimedia.org/r/231556 should fix this up well enough. It worked for @JanZerebecki who still had the old bad cookie, which the fixup wiped out. Can others confirm?

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-21 Thread BBlack
BBlack added a comment. Ah, that makes some logical sense. We should probably strip duplicate _User cookies the way we do for duplicate _Token, to address the bulk of it.

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-21 Thread BBlack
BBlack added a comment. Pushed a fix for deleting the duplicate centralauth_User for wikidata.org, should be in effect globally now.
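The deployed fix lives in VCL at the edge, but the core operation (dropping a repeated cookie name from a Cookie header) can be sketched in Python. Which duplicate "wins" below is an assumption for the sketch; the real fixup deletes the specific stale copy:

```python
def dedupe_cookie(cookie_header: str,
                  names=("centralauth_User", "centralauth_Token")) -> str:
    """Drop repeated occurrences of the listed cookie names, keeping only
    the first copy of each. Keeping the first is an illustrative choice;
    the deployed VCL chooses based on which copy is the stale one."""
    seen = set()
    kept = []
    for pair in cookie_header.split("; "):
        name = pair.split("=", 1)[0]
        if name in names:
            if name in seen:
                continue  # duplicate of a tracked cookie: discard
            seen.add(name)
        kept.append(pair)
    return "; ".join(kept)
```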

[Wikidata-bugs] [Maniphest] [Updated] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack removed parent tasks: T174932: Recurrent 'mailbox lag' critical alerts and 500s, T175473: Multiple 503 Errors.

[Wikidata-bugs] [Maniphest] [Commented On] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack added a comment. Can you explain in more detail? Is the subject of this ticket what was shown as an error in your browser window? I doubt this is related to varnish and/or "mailbox lag".

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Copying this in from etherpad (this is less awful than 6 hours of raw IRC+SAL logs, but still pretty verbose): # cache servers work ongoing here, ethtool changes that require short depooled downtimes around short ethernet port outages: 17:49 bblack: ulsfo cp servers

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. My gut instinct remains what it was at the end of the log above. I think something in the revert of wikidatawiki to wmf.4 fixed this. And I think given the timing alignment of the Fix sorting of NullResults changes + the initial ORES->wikidata fatals makes those

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Unless anyone objects, I'd like to start with reverting our emergency varnish max_connections changes from https://gerrit.wikimedia.org/r/#/c/386756 . Since the end of the log above, connection counts have returned to normal, which is ~100, which is 1/10th the norm

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. In T179156#3715432, @hoo wrote: I think I found the root cause now, seems it's actually related to the WikibaseQualityConstraints extension: Isn't that the same extension referenced in the suspect commits mentioned above? 18:51 ladsgroup@tin: Synchronized

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. Updates from the Varnish side of things today (since I've been bad about getting commits/logs tagged onto this ticket): 18:15 - I took over looking at today's outburst on the Varnish side The current target at the time was cp1053 (after elukey's earlier re

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. A while after the above, @hoo started focusing on a different aspect of this we've been somewhat ignoring as more of a side-symptom: that there tend to be a lot of sockets in a strange state on the "target" varnish, to various MW nodes. They look strange on

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Does Echo have any kind of push notification going on, even in light testing yet?

[Wikidata-bugs] [Maniphest] [Updated] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Now that I'm digging deeper, it seems there are one or more projects in progress built around Push-like things, in particular T113125 . I don't see any evidence that there's been live deploy of them yet, but maybe I'm missing something or other. If w

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3718772, @ema wrote: There's a timeout limiting the total amount of time varnish is allowed to spend on a single request, send_timeout, defaulting to 10 minutes. Unfortunately there's no counter tracking when the timer kicks in, although a debug line

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on our frontends, acts as the limit for receiving all client request headers, but I'm not sure that it has such a limitation that applies to client-sent bodie

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719928, @daniel wrote: In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into the stack That's assuming varnish always caches the entire request, and never "streams" to the backend, even fo

[Wikidata-bugs] [Maniphest] [Lowered Priority] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack lowered the priority of this task from "Unbreak Now!" to "High". BBlack added a comment. Reducing this from UBN->High, because the current best working theory is that this problem is gone so long as we keep the VCL do_stream=false change reverted. Obviously, there's still

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719995, @BBlack wrote: We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. Rewinding a little: this is false, I was just getting confused by terminology. Commons "chunked

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3720392, @BBlack wrote: In T179156#3719995, @BBlack wrote: We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. Rewinding a little: this is false, I was just getting confused by

[Wikidata-bugs] [Maniphest] [Changed Status] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-06 Thread BBlack
BBlack lowered the priority of this task from "High" to "Normal". BBlack changed the task status from "Open" to "Stalled". BBlack added a comment. The timeout changes above will offer some insulation, and as time passes we're not seeing evidence of th

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-22 Thread BBlack
BBlack added a comment. No, we never made an incident rep on this one, and I don't think it would be fair at this time to implicate ORES as a cause. We can't really say that ORES was directly involved at all (or any of the other services investigated here). Because the cause was so

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2017-12-11 Thread BBlack
BBlack added a comment. It's a pain any direction we slice this, and I'm not fond of adding new canonical domains outside the known set for individual low-traffic projects. We didn't add new domains for a variety of other public-facing efforts (e.g. wdqs, ORES, maps, etc). We d

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. This is probably due to backend timeouts, I would guess? The default applayer settings being applied to wdqs include `between_bytes_timeout` at only 4s, whereas `first_byte_timeout` is 185s. So if wdqs delayed all output, it would have 3 minutes or so, but once it

[Wikidata-bugs] [Maniphest] [Commented On] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. I did some live experimentation with manual edits to the VCL. It is the `between_bytes_timeout`, but the situation is complex. The timeout that's failing is on the varnish frontend fetching from the varnish backend. These are fixed at 2s, but because this i
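The interplay of the two timeouts described in these comments can be illustrated with a small deterministic simulation (the 185s/4s defaults come from the comments above; the function and its chunk format are invented for the sketch, not anything Varnish exposes):

```python
class BetweenBytesTimeout(Exception):
    """Raised when an inter-chunk gap exceeds the applicable limit."""

def read_stream(chunks, first_byte_timeout=185.0, between_bytes_timeout=4.0):
    """Consume (delay_seconds, data) pairs the way a proxy with separate
    first-byte and between-bytes timeouts would: the wait for the first
    chunk may take up to first_byte_timeout, but every subsequent
    inter-chunk gap is limited to between_bytes_timeout. Deterministic
    simulation; no real sleeping happens."""
    out = b""
    for i, (delay, data) in enumerate(chunks):
        limit = first_byte_timeout if i == 0 else between_bytes_timeout
        if delay > limit:
            raise BetweenBytesTimeout(
                f"gap of {delay}s exceeded {limit}s limit on chunk {i}")
        out += data
    return out
```

This shows why a backend that thinks for a long time up front is fine, while one that stalls mid-stream gets cut off: a 100s pause before the first byte passes, but a 5s pause between later bytes does not.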

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a blocking task: T128813: cache_misc's misc_fetch_large_objects has issues.

[Wikidata-bugs] [Maniphest] [Commented On] T121135: Banners fail to show up occassionally on Russian Wikivoyage

2016-04-05 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T121135#1910435, @Atsirlin wrote: > @Legoktm: Frankly speaking, for a small project like Wikivoyage the cache brings no obvious benefits, but triggers many serious issues including the problem of page banners and ToC.

[Wikidata-bugs] [Maniphest] [Edited] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack edited the task description.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack added a comment. F3845100: Screen Shot 2016-04-07 at 7.47.28 PM.png <https://phabricator.wikimedia.org/F3845100>

[Wikidata-bugs] [Maniphest] [Closed] T127014: Empty result on a tree query

2016-04-11 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task. BBlack added a comment. We ended up solving this in reverse order. The between_bytes_timeout values are already raised for varnish<->varnish, but we're still working on the broader stream issues. So this s

[Wikidata-bugs] [Maniphest] [Unblock] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-04-12 Thread BBlack
BBlack closed blocking task T128813: cache_misc's misc_fetch_large_objects has issues as "Resolved".

[Wikidata-bugs] [Maniphest] [Unblock] T127014: Empty result on a tree query

2016-04-12 Thread BBlack
BBlack closed blocking task T128813: cache_misc's misc_fetch_large_objects has issues as "Resolved".

[Wikidata-bugs] [Maniphest] [Created] T132457: Move wdqs to an LVS service

2016-04-12 Thread BBlack
BBlack created this task. Herald added a subscriber: Aklapper. Herald added projects: Operations, Wikidata, Discovery. TASK DESCRIPTION Currently wdqs is defined directly in varnish as a pool of randomized backend hostnames. There should be a real service hostname for the internal service

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-04-24 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Updated] T102476: RFC: Requirements for change propagation

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-05-03 Thread BBlack
BBlack added a comment. I really don't think it's specifically Wikidata-related either at this point. Wikidata might be a significant driver of update jobs in general, but the code changes driving the several large rate increases were probably generic to all update jobs.

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Do you know if some normal traffic is affected, such that we'd know a start date for a recent change in behavior? Or is it suspected that it was always this way? I've been digging through some debugging on this URL (which is an applayer chunked-respon

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Just jotting down the things I know so far from investigating this morning. I still don't have a good answer yet. Based on just the test URL, debugging it extensively at various layers: 1. The response size of that URL is in the ballpark of 32KB uncompr

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-05 Thread BBlack
BBlack added a comment. Did some further testing on an isolated test machine, using our current varnish3 package. - Got 2833-byte test file from uncorrupted (--compressed) output on prod. This is the exact compressed content bytes emitted by MW/Apache for the broken test URL
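The verification step described here — capturing the known-good compressed bytes and confirming they decompress to the expected size — can be sketched in a few lines. The ~32KB figure comes from the earlier comment; the synthetic payload and function name below are assumptions for illustration, not the actual test procedure:

```python
import gzip

def check_payload(compressed: bytes, expected_uncompressed_len: int) -> bool:
    """Decompress a captured response body and verify it round-trips
    to the expected uncompressed length (~32KB for the test URL)."""
    data = gzip.decompress(compressed)
    return len(data) == expected_uncompressed_len

# Illustrative stand-in for the captured compressed file: compress a
# synthetic ~32KB payload and confirm the check passes on clean data.
payload = b"x" * 32 * 1024
ok = check_payload(gzip.compress(payload), len(payload))
print(ok)  # → True
```

A corrupted or truncated capture would either fail to decompress (raising an error) or come up short on length, which is the signal being hunted in this bug.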

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a comment. Thanks for merging in the probably-related tasks. I had somehow missed really noticing T123159 earlier... So probably digging into gunzip itself isn't a fruitful path. I'm going to open a separate blocker for this that's private, so we can keep

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a blocking task: Restricted Task.

[Wikidata-bugs] [Maniphest] [Triaged] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack triaged this task as "High" priority.

[Wikidata-bugs] [Maniphest] [Unblock] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed blocking task Restricted Task as "Resolved".

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a blocking task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack added a comment. So, as it turns out, this is a general varnishd bug in our specific varnishd build. For purposes of this bug, our varnishd code is essentially 3.0.7 plus a bunch of ancient forward-ported 'plus' patches related to streaming, and we're missing htt

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a comment. We now have some understanding of the mechanism of this bug ( https://phabricator.wikimedia.org/T133866#2275985 ). It should go away in the imminent varnish 4 upgrade of the misc cluster in https://phabricator.wikimedia.org/T131501.

[Wikidata-bugs] [Maniphest] [Closed] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task. BBlack added a comment. This works now. There's a significant pause at the start of the transfer from the user's perspective if it's not a cache hit, because streaming is disabled as a workaround (so

[Wikidata-bugs] [Maniphest] [Closed] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack added a comment. My test cases on cache_text work now, should be resolved!

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Assuming there was no transient issue (which became cached) on the wdqs end of things, then this was likely a transient thing from nginx experiments or the cache_misc varnish4 upgrade. I banned all wdqs objects from cache_misc and now your test URL works fine. Can
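The symptom in the task title — the client counting bytes left unread when the transfer closed — amounts to comparing the declared Content-Length against the bytes actually received. A minimal sketch, using the numbers from the task title illustratively (the function and figures are assumptions, not the actual client logic):

```python
def bytes_remaining(content_length: int, received: bytes) -> int:
    """How many bytes short a response body came up, mirroring the
    'transfer closed with N bytes remaining to read' client error."""
    return max(content_length - len(received), 0)

# Simulated truncated transfer: server declared 20000 bytes, the
# connection closed after 4958, leaving 15042 unread.
print(bytes_remaining(20000, b"\0" * 4958))  # → 15042
```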

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Status update: we've been debugging this off and on all day. It's some kind of bug fallout from cache_misc's upgrade to Varnish 4. It's a very complicated bug, and we don't really understand it yet. We've made some band-aid fixes to

[Wikidata-bugs] [Maniphest] [Merged] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a subscriber: Kghbln. BBlack merged a task: T135121: stats.wikimedia.org down.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. In the merged ticket above, it's browser access to status.wm.o, and the browser's getting a 304 Not Modified and complaining about it (due to missing character encoding supposedly, but it's entirely likely it's missing everything and that'
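The browser complaint described here — a 304 Not Modified apparently missing its character encoding, and possibly everything else — can be illustrated with a toy header check. The header list below is an assumption for illustration only, not the actual varnish or browser logic:

```python
def diagnose_304(headers: dict) -> list:
    """Return the headers a revalidating client would normally fall
    back to its cached copy for, when they are absent from a 304
    Not Modified response (illustrative sketch only)."""
    expected = ["Content-Type", "Cache-Control", "ETag"]
    return [h for h in expected if h not in headers]

# A stripped-down 304 like the one described above: nearly everything
# missing, so the client has little to validate its cached copy against.
missing = diagnose_304({"Date": "Thu, 12 May 2016 00:00:00 GMT"})
print(missing)
```

Per HTTP semantics a 304 carries no body, so a client that insists on a character encoding must take it from its stored response rather than the 304 itself; a 304 that pairs with an empty or evicted cache entry produces exactly this kind of confusing client-side error.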

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. So we currently have several experiments in play trying to figure this out: 1. We've got 2x upstream bugfixes applied to our varnishd on cache_misc: https://github.com/varnishcache/varnish-cache/commit/d828a042b3fc2c2b4f1fea83021f0d5508649e50

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. Has anyone been able to reproduce any of the problems in the tickets merged into here, since roughly the timestamp of the above message?

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. I forgot one of our temporary hacks in the list above in https://phabricator.wikimedia.org/T134989#2290254: 4. https://gerrit.wikimedia.org/r/#/c/288656/ - we also enabled a critical small bit here in v4 vcl_hit. I reverted this for now during the varnish3

[Wikidata-bugs] [Maniphest] [Updated] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a blocked task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Block] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-13 Thread BBlack
BBlack reopened blocking task T131501: Convert misc cluster to Varnish 4 as "Open".

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. Current State: - cp3007 and cp1045 are depooled from user traffic, icinga-downtimed for several days, and have puppet disabled. Please do not re-enable puppet on these! They also have confd shut down, and are running custom configs to continue debugging this

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-16 Thread BBlack
BBlack added a comment. cache_maps cluster switched to the new varnish package today.

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-06-17 Thread BBlack
BBlack added a subscriber: GWicke. BBlack added a comment. @aaron and @GWicke - both patches sound promising, thanks for digging into this topic!

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-08-16 Thread BBlack
BBlack added a comment. 30 minutes isn't really reasonable, and neither is spamming more purge traffic. If there's a constant risk of the page content breaking without invalidation, how is even 30 minutes acceptable? Doesn't this mean that on average they'll be broken for
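The arithmetic behind the objection, as a sketch: with a fixed TTL and no purging, content that breaks at a uniformly random moment within a cache window is served stale for half the TTL on average. The numbers below just work that out for the proposed 30-minute lifetime:

```python
# Proposed 30-minute cache lifetime with no invalidation.
ttl_seconds = 30 * 60

# Breakage landing uniformly at random within a TTL window remains
# cached for half the TTL on average before the object expires.
avg_staleness_minutes = (ttl_seconds / 2) / 60
print(avg_staleness_minutes)  # → 15.0
```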
