[Wikidata-bugs] [Maniphest] T348831: [WD-ORG] [TECH] Max Lag alerts misfire with a DataSource error

2023-12-01 Thread fgiunchedi
fgiunchedi added a comment. FWIW I agree with testing the different boundaries, especially as you pointed out the alert is lax in terms of "reactivity" TASK DETAIL https://phabricator.wikimedia.org/T348831 EMAIL PREFERENCES https://phabricator.wikimedia.org/sett

[Wikidata-bugs] [Maniphest] T348831: [WD-ORG] [TECH] Max Lag alerts misfire with a DataSource error

2023-11-23 Thread fgiunchedi
fgiunchedi added a comment. For most cases I think alerting on "no data" and "values are all null" is sensible, in other words you expect to have data returned by the query at all times. In this case I can't quite figure out why the alert went "no data"

[Wikidata-bugs] [Maniphest] T350255: Repeated Wikidata Grafana alerts due to "failed to query data"

2023-11-01 Thread fgiunchedi
fgiunchedi added a comment. Thank you for reaching out @Lucas_Werkmeister_WMDE ! Yes indeed known issue, we (o11y) recommend turning off notifications for datasource errors (full rationale in https://phabricator.wikimedia.org/T347221#9264101) and the instructions being at https

[Wikidata-bugs] [Maniphest] T281267: various weekly and daily dumps run from systemd timers are broken

2023-06-30 Thread fgiunchedi
fgiunchedi added a comment. In T281267#8954763 <https://phabricator.wikimedia.org/T281267#8954763>, @ArielGlenn wrote: > @fgiunchedi I notice that in some cases phab tasks are autocreated when systemd units fail. Is that true for systemd jobs on snapshot hosts? Could we g

[Wikidata-bugs] [Maniphest] T332953: Migrate PipelineLib repos to GitLab

2023-04-12 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T332953 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, WMDE-leszek, leila, fkaelin, ItamarWMDE, elukey, KartikMistry, santhosh

[Wikidata-bugs] [Maniphest] T294014: Invalid wikidata daily metrics received

2022-12-05 Thread fgiunchedi
fgiunchedi removed a project: Observability-Metrics. fgiunchedi added a comment. Not a problem AFAICS, removing o11y TASK DETAIL https://phabricator.wikimedia.org/T294014 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Addshore

[Wikidata-bugs] [Maniphest] T316031: Clean up the rdf-streaming-updater-codfw container from thanos-swift.

2022-09-05 Thread fgiunchedi
fgiunchedi added a comment. Thank you for following up, I think the culprit is the fact that the S3 compat API stores chunks of big files in a separate container (suffixed with `+segments`). See also the audit below I ran logged into

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-24 Thread fgiunchedi
fgiunchedi added a comment. In T314835#8178848 <https://phabricator.wikimedia.org/T314835#8178848>, @dcausse wrote: > Moving forward we will: > > - stop the presto-swift client in favor of an S3 connector. > - cl

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-18 Thread fgiunchedi
fgiunchedi closed subtask T314914: Bump memcache connections and swift-proxy limits as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T314835 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: MatthewVernon, gmodena, elukey

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-10 Thread fgiunchedi
fgiunchedi added a comment. In T314835#8141914 <https://phabricator.wikimedia.org/T314835#8141914>, @fgiunchedi wrote: > Thank you @dcausse for diving deep into this issue and mitigating it! I can confirm that the space has stopped growing at the same rate (i.e. not gro

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-10 Thread fgiunchedi
fgiunchedi added a comment. Thank you @dcausse for diving deep into this issue and mitigating it! I can confirm that the space has stopped growing at the same rate (i.e. not growing ATM). I can confirm that I've seen the same failures from swift client doing mass deletes, not sure why

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-09 Thread fgiunchedi
fgiunchedi created this task. fgiunchedi added projects: Wikidata-Query-Service, SRE, SRE-swift-storage. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION It looks like wdqs more than tripled its storage space usage in the span of 10 days (from ~6T to ~21T

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi removed a subtask: T300723: Migrate Traffic Prometheus alerts from Icinga to Alertmanager. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson, fgiunchedi

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi removed a subtask: T294564: Migrate Foundations Prometheus alerts to AlertManager. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson, fgiunchedi, Aklapper

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi removed a subtask: T293399: Migrate the majority of the analytics cluster alerts from Icinga to AlertManager. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi removed a subtask: T289077: Migrate Search team's prometheus-based alerts from Icinga to alert-manager. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi removed a subtask: T285328: Migrate OSM sync alerts from icinga to AlertManager. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson, fgiunchedi, Aklapper

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-11 Thread fgiunchedi
fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. Resolving this in favor of parent task TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailp

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2022-07-01 Thread fgiunchedi
fgiunchedi edited projects, added SRE Observability (FY2022/2023-Q1); removed SRE Observability (FY2021/2022-Q4). TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson

[Wikidata-bugs] [Maniphest] T297494: Port Wikidata dashboard data from Graphite to Prometheus

2022-06-29 Thread fgiunchedi
fgiunchedi added a comment. In T297494#8036682 <https://phabricator.wikimedia.org/T297494#8036682>, @ItamarWMDE wrote: >> My understanding is that you are solving for the former problem (i.e. MW) (?) > > In this particular case, yes, the metrics are collected fro

[Wikidata-bugs] [Maniphest] T297494: Port Wikidata dashboard data from Graphite to Prometheus

2022-06-29 Thread fgiunchedi
fgiunchedi added subscribers: colewhite, Krinkle. fgiunchedi added a comment. In T297494#8022604 <https://phabricator.wikimedia.org/T297494#8022604>, @ItamarWMDE wrote: > Reposting from a Slack discussion, it appears as though statsd is still the preferred way to gather som

[Wikidata-bugs] [Maniphest] T297145: Ask for regular backups of our Wikidata Graphite data

2022-01-18 Thread fgiunchedi
fgiunchedi added a comment. This is complete I believe, we're backing up the `daily` hierarchy now TASK DETAIL https://phabricator.wikimedia.org/T297145 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-12-13 Thread fgiunchedi
fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. I'm tentatively resolving the task since all short term mitigations are completed, feel free to reopen if sth is amiss TASK DETAIL https://phabricator.wikimedia.org/T294

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-12-10 Thread fgiunchedi
fgiunchedi added a comment. In T294355#7563057 <https://phabricator.wikimedia.org/T294355#7563057>, @Manuel wrote: > Thank you for the suggestion @fgiunchedi! Do we have an explanation somewhere of how to do this? Sure no problem! My understanding is that thes

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-12-10 Thread fgiunchedi
fgiunchedi added a comment. @Manuel @Lydia_Pintscher going forward I suggest also investing resources to switch to Prometheus as the supported metric system. Graphite is deprecated and in "life support" mode while all producers (essentially mediawiki and related) are being p

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-12-10 Thread fgiunchedi
fgiunchedi added a comment. In T294355#7559074 <https://phabricator.wikimedia.org/T294355#7559074>, @Lucas_Werkmeister_WMDE wrote: > In T294355#7531241 <https://phabricator.wikimedia.org/T294355#7531241>, @fgiunchedi wrote: > >> In T294355#7531236 <https:

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-11-26 Thread fgiunchedi
fgiunchedi added a comment. In T294355#7531236 <https://phabricator.wikimedia.org/T294355#7531236>, @Lucas_Werkmeister_WMDE wrote: > I’m not sure I understand the discussion correctly :) do you still need a list of paths to back up, or does it look like we can back up every

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-11-25 Thread fgiunchedi
fgiunchedi added a comment. In T294355#7528880 <https://phabricator.wikimedia.org/T294355#7528880>, @jcrespo wrote: > One more question, to finally decide if setting up weekly full backups or daily but incremental- do all files mostly change completely, or only a subset

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-11-25 Thread fgiunchedi
fgiunchedi added a comment. In T294355#7527157 <https://phabricator.wikimedia.org/T294355#7527157>, @jcrespo wrote: > number of files are (within reason) a non-blocker for bacula, as files are packaged into volumes. It is true that each file is stored as a mysql record, but th

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-11-24 Thread fgiunchedi
fgiunchedi added a subscriber: jcrespo. fgiunchedi added a comment. In T294355#7527026 <https://phabricator.wikimedia.org/T294355#7527026>, @Lucas_Werkmeister_WMDE wrote: > Sounds like a good idea to me, I can’t judge how much would fit in Bacula. Do you need a list of importan

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-11-23 Thread fgiunchedi
fgiunchedi added a comment. I've sent the incident up for review, what do you think re: my proposal of adding parts of the hierarchy to bacula (if it is feasible in terms of number of files, e.g. `daily` is ~100k files now) TASK DETAIL https://phabricator.wikimedia.org/T294355 EMAIL

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-10-29 Thread fgiunchedi
fgiunchedi added a comment. Draft incident report: https://wikitech.wikimedia.org/wiki/Incident_documentation/2021-10-29_graphite Please feel free to integrate/change as needed. I'll be OOO until the 18th and I'll pick this back up TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-10-28 Thread fgiunchedi
fgiunchedi added a comment. Audit completed, what I did is count the number of null data points in the year leading to the graphite2003 reimage (i.e. the first reimage, where the backfill would have first failed) from 2020/10/14 to 2021/10/11 (first column). And the number of nulls after

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-10-28 Thread fgiunchedi
fgiunchedi added a comment. Status update: I'm running a full audit on all ~4M metric files looking for similar cases. The backfill from yesterday completed in the mean time and some metrics were able to be backfilled successfully. I'll be following up with an incident report about

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-10-27 Thread fgiunchedi
fgiunchedi added a comment. Status update: the backfill is still ongoing since I lowered the concurrency. The good news is that some metrics are already backfilled, e.g. api backend summary: https://grafana.wikimedia.org/d/2/api-backend-summary?viewPanel=31=1=161723520

[Wikidata-bugs] [Maniphest] T294355: Several Wikidata Grafana boards missing data before October 2021

2021-10-26 Thread fgiunchedi
fgiunchedi added a comment. @Lucas_Werkmeister_WMDE thank you for the report. Yes pretty sure the graphite bullseye migration is related. We backfilled graphite1004 from graphite2003 (which in turn was the first host we reimaged, and backfilled it from graphite1004), I suspect some metric

[Wikidata-bugs] [Maniphest] T294014: Invalid wikidata daily metrics received

2021-10-21 Thread fgiunchedi
fgiunchedi added a project: Observability-Metrics. TASK DETAIL https://phabricator.wikimedia.org/T294014 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Aklapper, fgiunchedi, Invadibot, maantietaja, lmata, Akuckartz, Nandana, Lahi

[Wikidata-bugs] [Maniphest] T294014: Invalid wikidata daily metrics received

2021-10-21 Thread fgiunchedi
fgiunchedi created this task. fgiunchedi added a project: Wikidata. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Similarly to T293329: Invalid wikidata graphite metrics received <https://phabricator.wikimedia.org/T293329> there are `wikidata.daily` metrics

[Wikidata-bugs] [Maniphest] T293329: Invalid wikidata graphite metrics received

2021-10-21 Thread fgiunchedi
fgiunchedi added a comment. @Lucas_Werkmeister_WMDE can confirm the invalid metrics don't show up anymore, thank you! I found others for `wikidata.daily` but will file a separate task for that: carbon-cache@b/listener.log:21/10/2021 03:00:00 :: invalid line received from client

[Wikidata-bugs] [Maniphest] T293329: Invalid wikidata graphite metrics received

2021-10-14 Thread fgiunchedi
fgiunchedi created this task. fgiunchedi added projects: Wikidata, Observability-Metrics. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION I noticed this in carbon logs on graphite1004, looks like some wikidata processes don't send a metric value ==> carbon-cach

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-09-28 Thread fgiunchedi
fgiunchedi closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: lmata, fgiunchedi Cc: fgiunchedi, Aklapper, Suran38, Biggs657, Invadibot, Lalamarie69, m

[Wikidata-bugs] [Maniphest] T290080: Move wikidata lag checks off Icinga

2021-09-07 Thread fgiunchedi
fgiunchedi added a comment. I agree, this is done TASK DETAIL https://phabricator.wikimedia.org/T290080 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ladsgroup, fgiunchedi Cc: colewhite, Addshore, Ladsgroup, fgiunchedi, Aklapper, the0001

[Wikidata-bugs] [Maniphest] T290080: Move wikidata lag checks off Icinga

2021-09-01 Thread fgiunchedi
fgiunchedi added a comment. Thank you @Addshore and @Ladsgroup ! Much easier to go Grafana for now, I've retitled/repurposed the task and thanks for your help on T240685 <https://phabricator.wikimedia.org/T240685> ! TASK DETAIL https://phabricator.wikimedia.org/T290080 EMAIL PREFE

[Wikidata-bugs] [Maniphest] T290080: Move wikidata lag checks off Icinga

2021-09-01 Thread fgiunchedi
fgiunchedi renamed this task from "Collect wikidata/siteinfo in Prometheus" to "Move wikidata lag checks off Icinga". fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T290080 EMAIL PREFERENCES https://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] T290080: Collect wikidata/siteinfo in Prometheus

2021-08-31 Thread fgiunchedi
fgiunchedi created this task. fgiunchedi added projects: observability, MediaWiki-General, Wikidata. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION This is a followup in the context of migrating wikidata alerts to AlertManager (T287741 <https://phabricator.wikimedia.

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-08-23 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: lmata, fgiunchedi Cc: fgiunchedi, Aklapper, Biggs657, Invadibot, Lalamarie69, MPhamWMF, maantietaja, lmata

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-08-12 Thread fgiunchedi
fgiunchedi added a parent task: T288622: All Prometheus based alerts move from Icinga to alert manager exclusively. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi

[Wikidata-bugs] [Maniphest] T281454: Onboard teams with Prometheus-based alerts to AlertManager

2021-08-12 Thread fgiunchedi
fgiunchedi added a parent task: T288622: All Prometheus based alerts move from Icinga to alert manager exclusively. TASK DETAIL https://phabricator.wikimedia.org/T281454 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Jdlrobson

[Wikidata-bugs] [Maniphest] T287741: Convert wikidata-alerts grafana dashboard to AlertManager

2021-08-02 Thread fgiunchedi
fgiunchedi added a parent task: T281359: Onboard teams with Grafana alerts to AlertManager. TASK DETAIL https://phabricator.wikimedia.org/T287741 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Aklapper, Addshore, Invadibot

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-08-02 Thread fgiunchedi
fgiunchedi added a subtask: T287741: Convert wikidata-alerts grafana dashboard to AlertManager. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Biggs657

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-08-02 Thread fgiunchedi
fgiunchedi closed subtask T282806: Port traffic/netops grafana alerts to AlertManager as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Biggs657

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-07-29 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Biggs657, Invadibot, Lalamarie69, MPhamWMF, maantietaja, lmata

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-07-28 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Biggs657, Invadibot, Lalamarie69, MPhamWMF, maantietaja, lmata

[Wikidata-bugs] [Maniphest] T281359: Onboard teams with Grafana alerts to AlertManager

2021-07-26 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T281359 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Invadibot, MPhamWMF, maantietaja, lmata, CBogen, Akuckartz, Nandana

[Wikidata-bugs] [Maniphest] T272128: Fix tracking for query service UI

2021-07-14 Thread fgiunchedi
fgiunchedi added a comment. In T272128#7212524 <https://phabricator.wikimedia.org/T272128#7212524>, @Ladsgroup wrote: > Tomorrow all metrics starting with `wikibase.queryService.ui.app.` should be migrated to `wikibase.queryService.ui.index.app.` I will deploy it around

[Wikidata-bugs] [Maniphest] T262741: "Wikidata API format usage" Grafana dashboard is empty

2021-04-20 Thread fgiunchedi
fgiunchedi added a project: observability. TASK DETAIL https://phabricator.wikimedia.org/T262741 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Lydia_Pintscher, Addshore, abian, Aklapper, Invadibot, maantietaja, lmata, Akuckartz

[Wikidata-bugs] [Maniphest] T274249: Offboard wdqs-admins from legacy pager in Icinga

2021-03-03 Thread fgiunchedi
fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. Chatted with @gehel and concluded we're ok to offboard him and @RKemper from legacy paging as-is! TASK DETAIL https://phabricator.wikimedia.org/T274249 EMAIL PREFERENC

[Wikidata-bugs] [Maniphest] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2021-03-02 Thread fgiunchedi
fgiunchedi added a comment. Restricted Application added a project: wdwb-tech-focus. random-ish update re: checkpoint storage after a chat with @Zbyszko: the current situation is that we're using thanos-swift cluster for wdqs flink checkpoints. This is meant to be a temporary allocation

[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-02 Thread fgiunchedi
fgiunchedi added a project: observability. TASK DETAIL https://phabricator.wikimedia.org/T269204 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: RKemper, dcausse, Aklapper, lmata, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-22 Thread fgiunchedi
fgiunchedi added a comment. In T246004#6567338 <https://phabricator.wikimedia.org/T246004#6567338>, @Zbyszko wrote: > Thank you all for swift (pun intended) action! haha! the account is setup now, I've written the credentials in your home on `deploy1001` TASK DETAI

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-20 Thread fgiunchedi
fgiunchedi added a comment. In T246004#6560303 <https://phabricator.wikimedia.org/T246004#6560303>, @Ottomata wrote: > I don't know! @fgiunchedi how does one access the cluster? @elukey can check the network VLAN ACLs and update accordingly. The canonical url is https

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-13 Thread fgiunchedi
fgiunchedi added a comment. Ok, thank you for the information. It doesn't seem we have an isolated test environment anyways so even though I'm reluctant we'll have to test on the production swift cluster. A middle ground I suppose would be to create the accounts on the `thanos` swift

[Wikidata-bugs] [Maniphest] T265015: TypeError: $.widget is not a function -- User script error

2020-10-08 Thread fgiunchedi
fgiunchedi removed a project: Wikimedia-Logstash. fgiunchedi added a comment. Removing wikimedia-logstash since it doesn't seem to be a logstash-specific issue, please add back if that's not the case! TASK DETAIL https://phabricator.wikimedia.org/T265015 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T265035: Gadget Error: "Uncaught TypeError: $(...).css(...).draggable is not a function"

2020-10-08 Thread fgiunchedi
fgiunchedi removed a project: Wikimedia-Logstash. fgiunchedi added a comment. Removing wikimedia-logstash since it doesn't seem to be a logstash-specific issue, please add back if that's not the case! TASK DETAIL https://phabricator.wikimedia.org/T265035 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T265037: User Script error: "ReferenceError: wikibase is not defined"

2020-10-08 Thread fgiunchedi
fgiunchedi removed a project: Wikimedia-Logstash. fgiunchedi added a comment. Removing wikimedia-logstash since it doesn't seem to be a logstash-specific issue, please add back if that's not the case! TASK DETAIL https://phabricator.wikimedia.org/T265037 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T265022: Client (Gadget?) Error: "NS_ERROR_FILE_CORRUPTED: "

2020-10-08 Thread fgiunchedi
fgiunchedi removed a project: Wikimedia-Logstash. fgiunchedi added a comment. Removing `wikimedia-logstash` since it doesn't seem to be a logstash-specific issue, please add back if that's not the case! TASK DETAIL https://phabricator.wikimedia.org/T265022 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-06 Thread fgiunchedi
fgiunchedi added a comment. In T246004#6520077 <https://phabricator.wikimedia.org/T246004#6520077>, @Zbyszko wrote: > @fgiunchedi Currently, Flink pipeline resides on the Analytics Hadoop cluster. As for the question whether Flink creates its containers - I think not, it did

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-29 Thread fgiunchedi
fgiunchedi added a comment. In T246004#6501106 <https://phabricator.wikimedia.org/T246004#6501106>, @Zbyszko wrote: > @fgiunchedi We estimate we'd need around 500GB of storage for the streaming updater (not accounting for replicas). Our use case is almost always write only (ch

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-24 Thread fgiunchedi
fgiunchedi added a subscriber: JMeybohm. fgiunchedi added a comment. @Zbyszko re: docker and swift. @JMeybohm suggested using https://github.com/swiftstack/docker-swift (and possibly lowering auth token TTLs to make sure renewing expired tokens works as expected) re: monitoring, we have

[Wikidata-bugs] [Maniphest] [Updated] T238540: Delete grafana dashboard, https://grafana.wikimedia.org/d/000000599/wikibase-wb_terms-newitemidformatter

2019-11-19 Thread fgiunchedi
fgiunchedi edited projects, added Traffic, observability; removed Graphite. Restricted Application added a project: Operations. TASK DETAIL https://phabricator.wikimedia.org/T238540 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc

[Wikidata-bugs] [Maniphest] [Commented On] T238540: Delete grafana dashboard, https://grafana.wikimedia.org/d/000000599/wikibase-wb_terms-newitemidformatter

2019-11-19 Thread fgiunchedi
fgiunchedi added a comment. I can confirm that a DELETE of https://grafana.wikimedia.org/api/dashboards/uid/000000599 results in a 403, further I don't see the request reaching grafana1001's apache logs. I'm adding #traffic <https://phabricator.wikimedia.org/tag/traffic/> since this

[Wikidata-bugs] [Maniphest] [Updated] T204364: Rate limit wdqs logs

2019-08-19 Thread fgiunchedi
fgiunchedi added a project: observability. TASK DETAIL https://phabricator.wikimedia.org/T204364 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Gehel, fgiunchedi Cc: gerritbot, Smalyshev, fgiunchedi, Gehel, Aklapper, Hook696, Daryl-TTMG

[Wikidata-bugs] [Maniphest] [Updated] T136852: Wikibase\Client\Changes\WikiPageUpdater logging is very verbose

2019-08-19 Thread fgiunchedi
fgiunchedi added a project: observability. TASK DETAIL https://phabricator.wikimedia.org/T136852 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bd808, fgiunchedi Cc: gerritbot, hoo, Addshore, Aklapper, bd808, Zppix, Hook696, Daryl-TTMG

[Wikidata-bugs] [Maniphest] [Updated] T178530: Improve field mapping for nginx logstash

2019-08-19 Thread fgiunchedi
fgiunchedi added a project: observability. TASK DETAIL https://phabricator.wikimedia.org/T178530 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: debt, fgiunchedi Cc: Stashbot, gerritbot, dcausse, Gehel, EBernhardson, Aklapper, Smalyshev, Hook696

[Wikidata-bugs] [Maniphest] [Commented On] T221774: Add Wikidata query service lag to Wikidata maxlag

2019-05-09 Thread fgiunchedi
fgiunchedi added a comment. In T221774#5155621 <https://phabricator.wikimedia.org/T221774#5155621>, @hoo wrote: > Possible way to do this: > > Create `PrometheusBlazegraphLagService` class which internally fetches the lag from a given Blazegraph instance li

[Wikidata-bugs] [Maniphest] [Edited] T187960: Rack/cable/configure asw2-a-eqiad switch stack

2019-03-05 Thread fgiunchedi
fgiunchedi updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T187960 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Cmjohnson, fgiunchedi Cc: akosiaris, Joe, fgiunchedi, hashar, Krinkle, ArielGlenn, jijiki, Addshore

[Wikidata-bugs] [Maniphest] [Commented On] T208215: Metrics from wdqs updater JMX should be prefixed

2018-12-12 Thread fgiunchedi
fgiunchedi added a comment. Any update? TASK DETAIL https://phabricator.wikimedia.org/T208215 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, Aklapper, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune

[Wikidata-bugs] [Maniphest] [Created] T208215: Metrics from wdqs updater JMX should be prefixed

2018-10-29 Thread fgiunchedi
fgiunchedi created this task. fgiunchedi added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION Noticed this while investigating something else, metrics exposed by jmx_exporter running on wdqs

[Wikidata-bugs] [Maniphest] [Commented On] T195121: Contribution from the IGN to Structured Data on Commons

2018-08-27 Thread fgiunchedi
fgiunchedi added a comment. In T195121#4529715, @aborrero wrote: In T195121#4527757, @fgiunchedi wrote: Sounds like a nice project! With my swift maintainer hat on, testing a single 200-300 GB chunk of data sounds good to me. Let's coordinate though before uploading the full data set because

[Wikidata-bugs] [Maniphest] [Updated] T195121: Contribution from the IGN to Structured Data on Commons

2018-08-23 Thread fgiunchedi
fgiunchedi added a comment. Sounds like a nice project! With my swift maintainer hat on, testing a single 200-300 GB chunk of data sounds good to me. Let's coordinate though before uploading the full data set because swift is pending its annual expansion (T201937) and I'd like to have

[Wikidata-bugs] [Maniphest] [Unblock] T195520: Multiple projects reporting Cannot access the database: No working replica DB server

2018-07-26 Thread fgiunchedi
fgiunchedi closed subtask T195530: status.wikimedia.org showing all lights green during major outage as "Invalid". TASK DETAIL https://phabricator.wikimedia.org/T195520 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Addshore, fgiunchedi Cc

[Wikidata-bugs] [Maniphest] [Updated] T192768: wdqs-updater crashing not cleanly

2018-04-24 Thread fgiunchedi
fgiunchedi added a comment. No planned upgrades ATM, though a newer upstream version might help with understanding (hopefully fixing) T192456: Prometheus metrics missing for some hosts too, so definitely welcome! TASK DETAIL https://phabricator.wikimedia.org/T192768 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] [Triaged] T186815: Badges not displaying on trwiki

2018-02-12 Thread fgiunchedi
fgiunchedi triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T186815 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Superyetkin, Aklapper, Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v

[Wikidata-bugs] [Maniphest] [Closed] T184434: prometheus-blazegraph-exporter failing to start after reboot

2018-01-15 Thread fgiunchedi
fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. Done, fix deployed TASK DETAIL https://phabricator.wikimedia.org/T184434 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi,

[Wikidata-bugs] [Maniphest] [Updated] T147328: Add http://tools.wmflabs.org/grafana-json-datasource as a datasource to production grafana instance

2017-07-06 Thread fgiunchedi
fgiunchedi added a project: User-fgiunchedi. fgiunchedi added a comment. @Addshore yes! I'll try taking a look in the next couple of weeks I think TASK DETAIL https://phabricator.wikimedia.org/T147328 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Addshore

[Wikidata-bugs] [Maniphest] [Triaged] T160685: Increase $wgExpensiveParserFunctionLimit on nowiki

2017-04-12 Thread fgiunchedi
fgiunchedi triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T160685 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Lydia_Pintscher, Krinkle, Reedy, jeblad, Aklapper, Th3d3v1ls, Hfbn0, QZanden,

[Wikidata-bugs] [Maniphest] [Triaged] T150356: Wikidata Query Service is overly verbose toward logstash

2016-11-29 Thread fgiunchedi
fgiunchedi triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T150356 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Smalyshev, Gehel, Aklapper, EBjune, mschwarzer, Avner, Zppix, debt, D3r1ck01, Jonas

[Wikidata-bugs] [Maniphest] [Commented On] T147329: Add simple-json-datasource plugin to productrion grafana instance

2016-10-27 Thread fgiunchedi
fgiunchedi added a comment. Merged, though following up from IRC: we're keeping grafana labs/prod segregated in their datasources to avoid introducing more production/labs dependencies. It'd be nice if the tool ran somewhere in production. AFAICS it is stateless so that shouldn't be too hard

[Wikidata-bugs] [Maniphest] [Triaged] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-04-28 Thread fgiunchedi
fgiunchedi triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T133490 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: Bovlb, Aklapper, Mushroom, Avner, debt, Gehel, D3r1ck01, FloNight, Izn

[Wikidata-bugs] [Maniphest] [Commented On] T119579: Additional diskspace of wdqs1001/wdqs1002

2015-12-04 Thread fgiunchedi
fgiunchedi added a comment. drive-by comment: partman will likely need to be adjusted too so we don't run into surprises when reprovisioning TASK DETAIL https://phabricator.wikimedia.org/T119579 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RobH

[Wikidata-bugs] [Maniphest] [Closed] T118850: Delete daily.wikidata.api.getclaims_property_use.* Graphite metrics

2015-12-02 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. {{done}} TASK DETAIL https://phabricator.wikimedia.org/T118850 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailp

[Wikidata-bugs] [Maniphest] [Closed] T118836: Delete wikibase.dispatch.* metrics

2015-12-02 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi closed this task as "Resolved". fgiunchedi claimed this task. fgiunchedi added a comment. {{done}} TASK DETAIL https://phabricator.wikimedia.org/T118836 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailp

[Wikidata-bugs] [Maniphest] [Triaged] T119579: Additional diskspace of wdqs1001/wdqs1002

2015-12-01 Thread fgiunchedi
fgiunchedi triaged this task as "Normal" priority. fgiunchedi added a subscriber: fgiunchedi. TASK DETAIL https://phabricator.wikimedia.org/T119579 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: fgiunchedi Cc: fgiunchedi, hoo, Akl

[Wikidata-bugs] [Maniphest] [Commented On] T117735: Track all Wikidata metrics currently gathered in Graphite rather than SQL and TSVs

2015-11-11 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi added a comment. thanks for expanding on that, here's my (as the person who's been looking after our graphite stack) opinion: - graphite isn't really data warehouse, thus I wouldn't recommend it as the primary storage for the verbatim

[Wikidata-bugs] [Maniphest] [Commented On] T117732: Create a Graphite instance in the Analytics cluster

2015-11-09 Thread fgiunchedi
fgiunchedi added a comment. @addshore to clarify, more than functionality I was pointing out guarantees about the data stored. if the metrics are also being archived to hdfs for example so it is possible to dump/load into graphite then IMO that's acceptable. re: analytics graphite instance, I

[Wikidata-bugs] [Maniphest] [Commented On] T117402: Enable retention of daily metrics for longer periods of time in Graphite

2015-11-03 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi added a comment. for long term data warehousing or analytics type of workflows using our hadoop/analytics infrastructure will be more appropriate I think. graphite is more focused on operational metrics from applications, services and so

[Wikidata-bugs] [Maniphest] [Assigned] T95679: Make a puppet role that sets up a query service and loads it

2015-07-20 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi assigned this task to GLavagetto. fgiunchedi added a comment. moving to @joe TASK DETAIL https://phabricator.wikimedia.org/T95679 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: GLavagetto

[Wikidata-bugs] [Maniphest] [Reassigned] T95679: Make a puppet role that sets up a query service and loads it

2015-07-20 Thread fgiunchedi
fgiunchedi reassigned this task from GLavagetto to Joe. fgiunchedi added a subscriber: GLavagetto. TASK DETAIL https://phabricator.wikimedia.org/T95679 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Joe, fgiunchedi Cc: GLavagetto, fgiunchedi

[Wikidata-bugs] [Maniphest] [Commented On] T84902: deploy haedus and capella with os for orientdb testing

2015-03-30 Thread fgiunchedi
fgiunchedi added a subscriber: fgiunchedi. fgiunchedi added a comment. looks like this is completed, anything else left @joe ? TASK DETAIL https://phabricator.wikimedia.org/T84902 REPLY HANDLER ACTIONS Reply to comment or attach files, or !close, !claim, !unsubscribe or !assign username