hoo created this task. hoo added projects: Wikidata, ORES, Operations, Wikimedia-Incident, Release-Engineering-Team. Herald added a subscriber: Aklapper. Herald added a project: Scoring-platform-team. |
TASK DESCRIPTION
Starting at around 18:45 on October 26 we had a huge spike of 503s.
F10464980: Screenshot_20171027_123424.png
The perceivable problem on cp1053 were to many open connections to the backend, leaving no room for new requests. As we weren't able to identify what actually caused the connection limit exhaustion, we tried to disable several recently deployed/ potentially troublesome things.
Finally lifting the connection limit from 1k to 10k on cp1053 mitigated the problem, but the underlying issue is most likely still not resolved.
Related SAL entries:
03:02 bblack: cp1067, cp4026 - backend restarts, mailbox lag 22:47 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikidata to wmf.4 22:39 bblack: restarting varnish-backend on cp1053 (mailbox lag from ongoing issues elsewhere?) 21:40 bblack: raising backend max_connections for api.svc.eqiad.wmnet + appservers.svc.eqiad.wmnet from 1K to 10K on cp1053.eqiad.wmnet (current funnel for the bulk of the 503s) 21:32 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporary disable remex html (T178632) (duration: 00m 50s) 21:30 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporary disable remex html (T178632) (duration: 00m 50s) 21:00 hoo: Fully revert all changes related to T178180 20:59 hoo@tin: Synchronized wmf-config/Wikibase.php: Revert "Add property for RDF mapping of external identifiers for Wikidata" (T178180) (duration: 00m 50s) 20:02 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: UBN! disbale ores for wikidata (T179107) (duration: 00m 50s) 20:00 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: UBN! disbale ores for wikidata (T179107) (duration: 00m 50s)
Changes done during investigation that need to be undone:
- Move wikidatawiki back to wmf5
- Re-enable remex html (T178632)
- Re-enable the RDF mapping of external identifiers on Wikidata (T178180)
- Re-enable ORES on wikidata (T179107)
Also the train was halted due to this (T174361).
TASK DETAIL
EMAIL PREFERENCES
To: hoo
Cc: Legoktm, tstarling, awight, Ladsgroup, Lydia_Pintscher, ori, BBlack, demon, greg, Aklapper, hoo, Lahi, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, Zppix, Mkdw, Liudvikas, srodlund, Luke081515, Wikidata-bugs, aude, ArielGlenn, faidon, zeljkofilipin, Alchimista, He7d3r, Mbch331, Rxy, Jay8g, fgiunchedi, mmodell
Cc: Legoktm, tstarling, awight, Ladsgroup, Lydia_Pintscher, ori, BBlack, demon, greg, Aklapper, hoo, Lahi, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, Zppix, Mkdw, Liudvikas, srodlund, Luke081515, Wikidata-bugs, aude, ArielGlenn, faidon, zeljkofilipin, Alchimista, He7d3r, Mbch331, Rxy, Jay8g, fgiunchedi, mmodell
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs