hoo created this task.
hoo added projects: Wikidata, ORES, Operations, Wikimedia-Incident, Release-Engineering-Team.
Herald added a subscriber: Aklapper.
Herald added a project: Scoring-platform-team.

TASK DESCRIPTION

Starting at around 18:45 on October 26 we had a huge spike of 503s.

F10464980: Screenshot_20171027_123424.png

The perceivable problem on cp1053 were to many open connections to the backend, leaving no room for new requests. As we weren't able to identify what actually caused the connection limit exhaustion, we tried to disable several recently deployed/ potentially troublesome things.

Finally lifting the connection limit from 1k to 10k on cp1053 mitigated the problem, but the underlying issue is most likely still not resolved.

Related SAL entries:

 03:02 bblack: cp1067, cp4026 - backend restarts, mailbox lag
22:47 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikidata to wmf.4
22:39 bblack: restarting varnish-backend on cp1053 (mailbox lag from ongoing issues elsewhere?)
21:40 bblack: raising backend max_connections for api.svc.eqiad.wmnet + appservers.svc.eqiad.wmnet from 1K to 10K on cp1053.eqiad.wmnet (current funnel for the bulk of the 503s)
21:32 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporary disable remex html (T178632) (duration: 00m 50s)
21:30 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporary disable remex html (T178632) (duration: 00m 50s)
21:00 hoo: Fully revert all changes related to T178180
20:59 hoo@tin: Synchronized wmf-config/Wikibase.php: Revert "Add property for RDF mapping of external identifiers for Wikidata" (T178180) (duration: 00m 50s)
20:02 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: UBN! disbale ores for wikidata (T179107) (duration: 00m 50s)
20:00 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: UBN! disbale ores for wikidata (T179107) (duration: 00m 50s)

Changes done during investigation that need to be undone:

  • Move wikidatawiki back to wmf5
  • Re-enable remex html (T178632)
  • Re-enable the RDF mapping of external identifiers on Wikidata (T178180)
  • Re-enable ORES on wikidata (T179107)

Also the train was halted due to this (T174361).


TASK DETAIL
https://phabricator.wikimedia.org/T179156

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo
Cc: Legoktm, tstarling, awight, Ladsgroup, Lydia_Pintscher, ori, BBlack, demon, greg, Aklapper, hoo, Lahi, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, Zppix, Mkdw, Liudvikas, srodlund, Luke081515, Wikidata-bugs, aude, ArielGlenn, faidon, zeljkofilipin, Alchimista, He7d3r, Mbch331, Rxy, Jay8g, fgiunchedi, mmodell
_______________________________________________
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs

Reply via email to