Michael created this task.
Michael added projects: Wikidata, wmde-wikidata-tech, Observability-Alerting.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION

The alert for the "DispatchChanges Normal job backlog time (p50, 15min)" metric occasionally misfires, reporting missing data. The most recent misfire was on Tue, 17 Oct 2023 18:15:38 +0000.

When looking at the actual data for that metric around that time, it looks complete: <https://grafana.wikimedia.org/d/CbmStnlGk/jobqueue-job?var-job=DispatchChanges&orgId=1&var-dc=codfw+prometheus%2Fk8s&from=1697566188860&to=1697566813505&viewPanel=5>

F38595558: image.png <https://phabricator.wikimedia.org/F38595558>

Also, an email resolving the alert consistently follows 5 or 10 minutes later.

**Notes:**
- There is also a task about this alert with respect to the eqiad->codfw datacenter switch: T330770: Investigate DispatchChanges Normal job backlog time (mean avg, 15min) alert post datacenter switch <https://phabricator.wikimedia.org/T330770>
- This may or may not be similar in cause or intermediate symptoms to T348831: [WD-ORG] [TECH] Max Lag alerts misfire with a DataSource error <https://phabricator.wikimedia.org/T348831>
- In the past, there have also been genuine issues with missing data: T341054: Wikibase DispatchChanges job potentially broken <https://phabricator.wikimedia.org/T341054>

TASK DETAIL
https://phabricator.wikimedia.org/T349178
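
To double-check that the linked Grafana view covers the moment of the misfire, the `from` and `to` parameters of the URL (epoch milliseconds) can be decoded and compared with the alert timestamp. The following is only a minimal sketch using the Python standard library; the helper name `grafana_window` is made up for illustration and is not part of any Grafana or Wikimedia tooling.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qs, urlparse

# URL and timestamp copied from the task description above.
GRAFANA_URL = (
    "https://grafana.wikimedia.org/d/CbmStnlGk/jobqueue-job"
    "?var-job=DispatchChanges&orgId=1&var-dc=codfw+prometheus%2Fk8s"
    "&from=1697566188860&to=1697566813505&viewPanel=5"
)
ALERT_TIME = "Tue, 17 Oct 2023 18:15:38 +0000"

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def grafana_window(url):
    """Return (start, end) of a Grafana URL's time range as UTC datetimes.

    Hypothetical helper: assumes `from` and `to` are absolute epoch
    milliseconds, as they are in the URL above.
    """
    query = parse_qs(urlparse(url).query)
    start = EPOCH + timedelta(milliseconds=int(query["from"][0]))
    end = EPOCH + timedelta(milliseconds=int(query["to"][0]))
    return start, end


start, end = grafana_window(GRAFANA_URL)
alert = parsedate_to_datetime(ALERT_TIME)

print("window:", start.isoformat(), "to", end.isoformat())
print("alert inside window:", start <= alert <= end)
```

For the URL above, the window runs from 18:09:48 to 18:20:13 UTC on 17 Oct 2023, so it does contain the 18:15:38 alert time.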