[Wikidata-bugs] [Maniphest] T290332: WDQS overloaded in codfw

2021-09-03 Thread Maintenance_bot
Maintenance_bot added a project: Wikidata.

TASK DETAIL
  https://phabricator.wikimedia.org/T290332

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Maintenance_bot
Cc: Aklapper, Gehel, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T290332: WDQS overloaded in codfw

2021-09-03 Thread Gehel
Gehel closed this task as a duplicate of T290330: 502 Bad Gateway on WDQS on 
ulsfo.

To: Gehel
Cc: Aklapper, Gehel, MPhamWMF, CBogen, Namenlos314, Gq86, 
Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles


[Wikidata-bugs] [Maniphest] T290332: WDQS overloaded in codfw

2021-09-03 Thread Gehel
Gehel created this task.
Gehel added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  At around 9AM UTC today (Sep 3) we started experiencing stability issues with 
WDQS, localized (at least for the moment) to one of our two datacenters. 
Unfortunately, we have not been able to pinpoint the cause yet. We suspect 
that someone is running a query that affects Blazegraph; this has happened a 
few times in the past. Unfortunately, our usual tactics did not help us find 
which query it is.
  
  We are working on identifying the issue, but it is clear that it could bring 
the service down within a few hours, so we are also working on a quick 
workaround. Since we observed that the issue only causes actual service 
failures about 2h after a restart, for now we are going to introduce a 
procedure that restarts servers randomly, so that the uptime of each server is 
at most around 1h. Only one server should be restarted at any given time. Each 
restart will kill the queries running on that server, but the alternative is 
worse.
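  
  As a rough illustration only, the restart procedure described above could be 
planned along these lines. This is a sketch under stated assumptions, not the 
procedure that was actually deployed: the function name plan_restarts, the 
5-minute restart duration, and the example host names are all hypothetical. 
It shuffles the hosts once and spreads the restarts evenly over the 1h window, 
so that no two restarts overlap and each host is restarted once per cycle.

```python
import random
from typing import List, Optional, Sequence, Tuple


def plan_restarts(
    hosts: Sequence[str],
    max_uptime_s: float = 3600.0,       # ~1h maximum uptime, from the task text
    restart_duration_s: float = 300.0,  # assumption: a restart takes <= 5 min
    rng: Optional[random.Random] = None,
) -> List[Tuple[float, str]]:
    """Plan one restart cycle as (offset_in_seconds, host) pairs.

    Hosts are shuffled once and spread evenly over max_uptime_s. Repeating
    the same plan every max_uptime_s restarts each host once per cycle.
    """
    if not hosts:
        return []
    slot = max_uptime_s / len(hosts)
    if slot < restart_duration_s:
        # Restarts would overlap, breaking "only one server at a time".
        raise ValueError("too many hosts for the given uptime window")
    order = list(hosts)
    (rng or random.Random()).shuffle(order)
    return [(i * slot, host) for i, host in enumerate(order)]


if __name__ == "__main__":
    # Hypothetical host names, used only for this demonstration.
    demo_hosts = [f"wdqs-example-{n}" for n in range(1, 5)]
    for offset, host in plan_restarts(demo_hosts):
        print(f"t+{offset:6.0f}s  restart {host}")
```

  The order is kept fixed across cycles on purpose: reshuffling every cycle 
could place a host first in one cycle and last in the next, which would let 
its uptime approach 2h.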

To: Gehel
Cc: Aklapper, Gehel, MPhamWMF, CBogen, Namenlos314, Gq86, 
Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles