[Wikidata-bugs] [Maniphest] T349118: Migrate node-based services in production to node18

2024-06-03 Thread Ottomata
Ottomata updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T349118 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: VPuffetMichel, Sbailey, WMDE-leszek, KartikMistry, Michael, gmodena, Ottomata, elukey

[Wikidata-bugs] [Maniphest] T344027: Validation Error for eventlogging_WMDEBannerSizeIssue

2023-11-02 Thread Ottomata
Ottomata merged a task: T349702: eventlogging_WMDEBannerSizeIssue validation errors. Ottomata added subscribers: Ottomata, Ahoelzl, JAllemandou, tchin, gabriel-wmde, kai.nissen, gmodena. TASK DETAIL https://phabricator.wikimedia.org/T344027 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T342593: Five deleted Wikidata items pertaining to Wikimedia category pages still present in the Query Service

2023-10-03 Thread Ottomata
Ottomata added a comment. > I agree with @Milimetric here and we need to get a better sense of the quality of the EventBus/EventGate system T345195: [SPIKE] Can we identify indicators to inform an SLO for event emission and intake? <https://phabricator.wikimedia.org/T345195&

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API

2023-06-28 Thread Ottomata
Ottomata added a comment. ^ sounds good! `wikidata` could be an okay name too. I like functional groupings. However, that might overlap with some dags that the search platform team does for wikidata in their `search` instance. ¯\_(ツ)_/¯ TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T334558: [Analytics] Unique user-agents accessing Wikidata's REST API

2023-06-28 Thread Ottomata
Ottomata added a comment. We should avoid team names in functional code / namespacing. Team names change often. However, wmde is more like 'wmf' in this case, and I think not likely to change. +1 to wmde. :) TASK DETAIL https://phabricator.wikimedia.org/T334558 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-02-02 Thread Ottomata
Ottomata updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: BTullis, JMeybohm, gmodena, Ottomata, bking, Aklapper, dcausse, Themindcoder, Adamm71, Jersione

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-26 Thread Ottomata
Ottomata added a subscriber: BTullis. Ottomata added a comment. > create a namespace for the rdf-streaming-updater on the dse-k8s cluster BTW, I _think_ there is more involved than just this Puppet patch? Reach out to @BTullis ? TASK DETAIL https://phabricator.wikimedia.org/T326

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-20 Thread Ottomata
Ottomata added a comment. > What namespace strategy should we use for flink jobs? A single one for all wmf flink jobs, per team, per project? One per application is what I'm expecting. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES ht

[Wikidata-bugs] [Maniphest] T298305: Realtime editing UI and API

2022-08-11 Thread Ottomata
Ottomata added a comment. I'm not sure where this task belongs. What UI and API are you referring to? See also: https://stream.wikimedia.org/?doc https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams TASK DETAIL https://phabricator.wikimedia.org/T298305 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-08-09 Thread Ottomata
Ottomata added subscribers: dcausse, Ottomata. Ottomata added a comment. Hi, I don't know much about this, but I did a little bit of digging. I can see that the flink session cluster jobmanager is taking checkpoints every few seconds, for each of the jobs it is running

[Wikidata-bugs] [Maniphest] T294133: Expose rdf-streaming-updater.mutation content through EventStreams

2022-02-10 Thread Ottomata
Ottomata added a comment. (Oh, past me said this already... :p) TASK DETAIL https://phabricator.wikimedia.org/T294133 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: RBrounley_WMF, Harej, Ottomata, Aklapper, dcausse, EChetty

[Wikidata-bugs] [Maniphest] T294133: Expose rdf-streaming-updater.mutation content through EventStreams

2022-02-10 Thread Ottomata
Ottomata added a comment. > Here you must consume only one. Actually, this is curious. These are really distinct streams. We probably should have named them differently. We can't just pick eqiad.rdf-streaming-updater.mutation when the user connects to EventStreams in eqiad, beca

[Wikidata-bugs] [Maniphest] T296470: Initialize WCQS production servers

2022-01-07 Thread Ottomata
Ottomata added a comment. Ran on main-eqiad and main-codfw kafka: kafka topics --create --topic eqiad.mediainfo-streaming-updater.mutation --replication-factor 3 --partitions 1 kafka configs --alter --entity-type topics --entity-name eqiad.mediainfo-streaming-updater.mutation

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-11-04 Thread Ottomata
Ottomata added a comment. ※\(^o^)/※ TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Gehel, Ottomata Cc: Mohammed_Sadat_WMDE, So9q, Lydia_Pintscher, VladimirAlexiev, karapayneWMDE, MPhamWMF

[Wikidata-bugs] [Maniphest] T294361: Events missing from event.rdf_streaming_updater_fetch_failure but present in /wmf/data/raw/event/eqiad.rdf-streaming-updater.fetch-failure

2021-10-28 Thread Ottomata
Ottomata added a comment. 3 out of your 7 records in this hour have meta.id == null (including this one), and Refine deduplication is removing all but one of them. You should set a unique value for `meta.id` in your events, but, arguably, Refine should not consider events with null
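
To illustrate the effect described in this comment, here is a minimal Python sketch of deduplication keyed on `meta.id`. It is not Refine's actual implementation; the function name `dedupe_by_meta_id`, the `keep_null_ids` flag, and the sample events are all hypothetical. It only shows why several events with a null id collapse into one under naive deduplication, and how exempting null ids would preserve them.

```python
from typing import Any


def dedupe_by_meta_id(events: list[dict[str, Any]], keep_null_ids: bool) -> list[dict[str, Any]]:
    """Keep the first event seen for each meta.id (hypothetical helper).

    When keep_null_ids is True, events whose meta.id is missing or null are
    never treated as duplicates of one another.
    """
    seen: set[Any] = set()
    kept: list[dict[str, Any]] = []
    for event in events:
        event_id = (event.get("meta") or {}).get("id")
        if event_id is None and keep_null_ids:
            kept.append(event)
            continue
        if event_id in seen:
            continue  # duplicate id (or a second null id in naive mode)
        seen.add(event_id)
        kept.append(event)
    return kept


# Made-up sample: 7 records in an hour, 3 of them with a null meta.id.
events = [
    {"meta": {"id": None}, "n": 1},
    {"meta": {"id": None}, "n": 2},
    {"meta": {"id": None}, "n": 3},
    {"meta": {"id": "a"}, "n": 4},
    {"meta": {"id": "b"}, "n": 5},
    {"meta": {"id": "c"}, "n": 6},
    {"meta": {"id": "d"}, "n": 7},
]

naive = dedupe_by_meta_id(events, keep_null_ids=False)
lenient = dedupe_by_meta_id(events, keep_null_ids=True)
print(len(naive), len(lenient))  # 5 7
```

In the naive mode two of the three null-id records are dropped, matching the "all but one" behaviour mentioned above.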

[Wikidata-bugs] [Maniphest] T294361: Events missing from event.rdf_streaming_updater_fetch_failure but present in /wmf/data/raw/event/eqiad.rdf-streaming-updater.fetch-failure

2021-10-28 Thread Ottomata
Ottomata added a comment. INTEResting. Will investigate. TASK DETAIL https://phabricator.wikimedia.org/T294361 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, JAllemandou, dcausse, Aklapper, EChetty, Invadibot

[Wikidata-bugs] [Maniphest] T294361: Events missing from event.rdf_streaming_updater_fetch_failure but present in /wmf/data/raw/event/eqiad.rdf-streaming-updater.fetch-failure

2021-10-28 Thread Ottomata
Ottomata claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T294361 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, JAllemandou, dcausse, Aklapper, EChetty, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz

[Wikidata-bugs] [Maniphest] T293195: Add MCR slot information to revision-create events

2021-10-27 Thread Ottomata
Ottomata added a comment. Done and deployed! TASK DETAIL https://phabricator.wikimedia.org/T293195 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse, Ottomata Cc: odimitrijevic, Cparle, JAllemandou, Milimetric, Aklapper, Ottomata, Pchelolo

[Wikidata-bugs] [Maniphest] T293195: Add MCR slot information to revision-create events

2021-10-26 Thread Ottomata
Ottomata added a comment. I was about to merge that today but then thought that your suggestion to ensure that properties validate with the additionalProperties stuff would be good to add first. So you could implement that :D :D TASK DETAIL https://phabricator.wikimedia.org/T293195

[Wikidata-bugs] [Maniphest] T294133: Expose rdf-streaming-updater.mutation content through EventStreams

2021-10-25 Thread Ottomata
Ottomata added a comment. Oh, we'll also want to create an event schema and add it to the schema repo. TASK DETAIL https://phabricator.wikimedia.org/T294133 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Aklapper, dcausse

[Wikidata-bugs] [Maniphest] T294133: Expose rdf-streaming-updater.mutation content through EventStreams

2021-10-25 Thread Ottomata
Ottomata added projects: Analytics, Event-Platform. Ottomata added a comment. > Here you must consume only one. Should we expose both? We'll need to declare these streams in EventStreamConfig, likely each as a distinct stream overriding the list of topics that make up the str

[Wikidata-bugs] [Maniphest] T293195: Add MCR slot information to revision-create events

2021-10-13 Thread Ottomata
Ottomata added subscribers: Milimetric, JAllemandou. Ottomata added projects: Analytics, Event-Platform, Data-Engineering. TASK DETAIL https://phabricator.wikimedia.org/T293195 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: JAllemandou

[Wikidata-bugs] [Maniphest] T276469: Cookbooks and / or operation procedures are adapted for the new Flink based WDQS Streaming Updater

2021-09-23 Thread Ottomata
Ottomata added a comment. Just making sure yall have considered using consumer offset sync as supported by MirrorMaker 2 <https://blog.cloudera.com/a-look-inside-kafka-mirrormaker-2/>. Upgrading to MM2 would be a big deal, but we haven't done it because no one has

[Wikidata-bugs] [Maniphest] T291089: Proposal: Generate Wikidata JSON & RDF dumps from Hadoop

2021-09-15 Thread Ottomata
Ottomata added a comment. > I imagine other sources like https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams would all have the same problems? Yes, EventStreams uses the same data. TASK DETAIL https://phabricator.wikimedia.org/T291089 EMAIL PREFERENCES ht

[Wikidata-bugs] [Maniphest] T291089: Proposal: Generate Wikidata JSON & RDF dumps from Hadoop

2021-09-15 Thread Ottomata
Ottomata added a comment. > And the new query service flink updater could also make use of the RDF stream Perhaps the existing logic in the WDQS updater to generate its RDF stream could be factored out into its own service? Or, at least, it could emit its RDF stream as a side out

[Wikidata-bugs] [Maniphest] T291089: Proposal: Generate Wikidata JSON & RDF dumps from Hadoop

2021-09-15 Thread Ottomata
Ottomata added a comment. > a reliable and consistent input (such as MediaWiki recent changes) I guess by this you mean polling the MW RecentChanges API? TASK DETAIL https://phabricator.wikimedia.org/T291089 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/pa

[Wikidata-bugs] [Maniphest] T287641: Review wd_propertysuggester event logging stream config

2021-07-29 Thread Ottomata
Ottomata added a comment. Ok, if everything is working, you should be able to emit events in deployment-prep now. TASK DETAIL https://phabricator.wikimedia.org/T287641 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata

[Wikidata-bugs] [Maniphest] T287641: Review wd_propertysuggester event logging stream config

2021-07-29 Thread Ottomata
Ottomata added a comment. Thanks, +1! Actually this is in labs so should be no prob to just merge and deploy. Will do that for ya now. TASK DETAIL https://phabricator.wikimedia.org/T287641 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata

[Wikidata-bugs] [Maniphest] T287641: Review wd_propertysuggester event logging stream config

2021-07-28 Thread Ottomata
Ottomata added a comment. Left some comments! :) TASK DETAIL https://phabricator.wikimedia.org/T287641 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Martaannaj, Michaelcochez, Aklapper, Invadibot, maantietaja, Akuckartz

[Wikidata-bugs] [Maniphest] T273901: Automate event stream ingestion into HDFS for streams that don't use EventGate

2021-07-27 Thread Ottomata
Ottomata closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T273901 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Zbyszko, Mholloway, MPhamWMF, Gehel, Aklapper, JAllemandou, dcausse, Mstyles

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2021-07-27 Thread Ottomata
Ottomata closed subtask T273901: Automate event stream ingestion into HDFS for streams that don't use EventGate as Resolved. TASK DETAIL https://phabricator.wikimedia.org/T269619 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse, Ottomata Cc

[Wikidata-bugs] [Maniphest] T273901: Automate event stream ingestion into HDFS for streams that don't use EventGate

2021-07-22 Thread Ottomata
Ottomata added a project: Analytics-Kanban. Ottomata added a comment. Yes! We've done this now that we are using Gobblin instead of Camus. Moving this to our Kanban so we can ACK and close it this week. TASK DETAIL https://phabricator.wikimedia.org/T273901 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T214362: RFC: Store WikibaseQualityConstraint check data in persistent storage

2021-03-19 Thread Ottomata
Ottomata added a comment. @Addshore I've quickly read the task description but I have to admit I don't fully understand it yet (what gets stored where, etc.). Find me on IRC? Or set up a quick little meeting? :) TASK DETAIL https://phabricator.wikimedia.org/T214362 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T276595: Upgrade prometheus-jmx-exporter

2021-03-11 Thread Ottomata
Ottomata edited projects, added Analytics-Clusters; removed Analytics. TASK DETAIL https://phabricator.wikimedia.org/T276595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Aklapper, fgiunchedi, colewhite, Ramtin0071, MPhamWMF

[Wikidata-bugs] [Maniphest] T276595: Upgrade prometheus-jmx-exporter

2021-03-11 Thread Ottomata
Ottomata added a comment. Hello! Does Analytics have to upgrade too? :) TASK DETAIL https://phabricator.wikimedia.org/T276595 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Aklapper, fgiunchedi, colewhite, Ramtin0071

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2021-03-04 Thread Ottomata
Ottomata added a comment. @Gehel @dcausse these events are now in HDFS. There aren't any Hive tables yet because no non-canary events have yet been ingested, and we filter out canary events from the Hive tables. TASK DETAIL https://phabricator.wikimedia.org/T269619 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T273901: Automate event stream ingestion into HDFS for streams that don't use EventGate

2021-03-03 Thread Ottomata
Ottomata added a subscriber: Mholloway. Ottomata added a comment. Ok here's the idea: I add a new EventStreamConfig settings block called `consumers`, where we can add consumers by name, and then put in relevant settings for them. Those consumers would be responsible for using those

[Wikidata-bugs] [Maniphest] T273901: Automate event stream ingestion into HDFS for streams that don't use EventGate

2021-02-04 Thread Ottomata
Ottomata created this task. Ottomata added projects: Wikidata-Query-Service, Analytics, Event-Platform. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION Currently, we configure Camus jobs to import events based on which

[Wikidata-bugs] [Maniphest] T270371: wikimedia-event-utilities should provide tools for JVM based apps producing directly to kafka

2020-12-17 Thread Ottomata
Ottomata updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T270371 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, dcausse, Aklapper, MPhamWMF, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz

[Wikidata-bugs] [Maniphest] T270371: wikimedia-event-utilities should provide tools for JVM based apps producing directly to kafka

2020-12-17 Thread Ottomata
Ottomata added a project: Event-Platform. TASK DETAIL https://phabricator.wikimedia.org/T270371 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, dcausse, Aklapper, MPhamWMF, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-17 Thread Ottomata
Ottomata added a comment. T270371 <https://phabricator.wikimedia.org/T270371> is great thank you! > If this kind of usecases is something we'd like to support We do. EventGate is a poor substitute for a full fledged Kafka client. Kafka has so many more features and knob

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread Ottomata
Ottomata added a comment. It depends on what you want to do :) EventGate will handle multi DC, filling some default values, and topic prefixes for you, but is an extra hop to Kafka. As a prod system in a language with a good Kafka client, producing to Kafka is totally allowed. You'd

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread Ottomata
Ottomata added a comment. @dcausse, will these be POSTed to an EventGate, or produced directly to Kafka? TASK DETAIL https://phabricator.wikimedia.org/T269619 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse, Ottomata Cc: Ottomata

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread Ottomata
Ottomata added a comment. > would it make sense to add some helper functions to wikimedia-event-utilities for validating a json against its schema Sure that could be useful :) TASK DETAIL https://phabricator.wikimedia.org/T269619 EMAIL PREFERENCES https://phabricator.wikimedia.

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-15 Thread Ottomata
Ottomata added a comment. @dcausse for retrieving schemas, https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master might help. :) TASK DETAIL https://phabricator.wikimedia.org/T269619 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings

[Wikidata-bugs] [Maniphest] T266495: Create Debian Package for Flink

2020-10-26 Thread Ottomata
Ottomata added a subscriber: elukey. Ottomata added a comment. > apache bigtop (from cloudera hadoop). That includes flink debs, would we want to use that? OH HO! I did not know that. Hopefully that would be best! @elukey FYI TASK DETAIL https://phabricator.wikimedia.org/T266

[Wikidata-bugs] [Maniphest] T265525: Wikibase\Client\Usage\Sql\SqlSubscriptionManager::subscribe: Expected mass rollback of all peer transactions (DBO_TRX set)

2020-10-22 Thread Ottomata
Ottomata removed a project: Analytics. Ottomata added a comment. Hm, is there a reason Analytics was tagged? Looks like a JobQueue error? Removing analytics, feel free to re-add if needed. TASK DETAIL https://phabricator.wikimedia.org/T265525 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T265525: Wikibase\Client\Usage\Sql\SqlSubscriptionManager::subscribe: Expected mass rollback of all peer transactions (DBO_TRX set)

2020-10-22 Thread Ottomata
Ottomata added a project: Platform Engineering. TASK DETAIL https://phabricator.wikimedia.org/T265525 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Aklapper, thcipriani, Akuckartz, 4748kitoko, darthmon_wmde, WDoranWMF, holger.knust

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-20 Thread Ottomata
Ottomata added a comment. Ok, it looks like we need to allow traffic to 10.2.2.54 and 10.2.1.54 from the Analytics VLAN. @elukey can you add that? TY! TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-19 Thread Ottomata
Ottomata added a comment. I don't know! @fgiunchedi how does one access the cluster? @elukey can check the network VLAN ACLs and update accordingly. TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T262942: PoC on anomaly detection with Flink

2020-10-12 Thread Ottomata
Ottomata added a comment. Hm, I think we need to find a use case that can be done using Stream SQL repl. I don't think SRE will deploy a Java app e.g. during a DDoS. Can Flink's SQL REPL do something like https://www.confluent.io/stream-processing-cookbook/ksql-recipes/syslog-pattern

[Wikidata-bugs] [Maniphest] [Commented On] T255399: Prepare wdqs1009 to run the streaming updater

2020-06-16 Thread Ottomata
Ottomata added a comment. Done: kafka configs --alter --entity-type topics --entity-name wdqs_streaming_updater_test --add-config retention.ms=267840 TASK DETAIL https://phabricator.wikimedia.org/T255399 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] [Commented On] T253753: Increase retention for mediawiki.revision-create on the kafka jumbo cluster

2020-05-27 Thread Ottomata
Ottomata added a comment. Did 31 days: $ kafka topics --alter --topic eqiad.mediawiki.revision-create --config retention.ms=267840 $ kafka topics --alter --topic codfw.mediawiki.revision-create --config retention.ms=267840 $ kafka topics --describe --topic
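
Kafka's `retention.ms` topic setting is expressed in milliseconds, so a retention period given in days has to be converted. The following Python sketch shows that arithmetic for the "31 days" mentioned above, for comparison with the value shown in the commands. The helper name `days_to_retention_ms` is hypothetical and is not part of any Kafka tooling.

```python
from datetime import timedelta


def days_to_retention_ms(days: int) -> int:
    """Convert a retention period in days to milliseconds (hypothetical helper)."""
    # Floor-dividing one timedelta by another yields an exact integer.
    return timedelta(days=days) // timedelta(milliseconds=1)


retention_ms = days_to_retention_ms(31)
print(retention_ms)  # 2678400000
```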

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-10 Thread Ottomata
Ottomata added a comment. While not a google doc, the parent ticket's description describes it pretty well: T244590: EPIC: Rework the WDQS updater as an event driven application <https://phabricator.wikimedia.org/T244590> TASK DETAIL https://phabricator.wikimedia.org/T247058

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Ottomata
Ottomata added a comment. A nice feature of Flink is its support for both batch and stream processing. Ideally, we'd be able to build lambda architectures <https://en.wikipedia.org/wiki/Lambda_architecture> reusing most of the core data logic for streams and historical batch backf

[Wikidata-bugs] [Maniphest] [Updated] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Ottomata
Ottomata added a project: Analytics. TASK DETAIL https://phabricator.wikimedia.org/T247058 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Milimetric, JAllemandou, Ottomata, Pchelolo, Joe, Aklapper, dcausse, Zbyszko, Gehel, 4748kitoko

[Wikidata-bugs] [Maniphest] [Commented On] T247058: Deployment strategy and hardware requirement for new Flink based WDQS updater

2020-03-06 Thread Ottomata
Ottomata added a comment. Ping also @Pchelolo for comments on ^ TASK DETAIL https://phabricator.wikimedia.org/T247058 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Pchelolo, Joe, Aklapper, dcausse, Zbyszko, Gehel

[Wikidata-bugs] [Maniphest] [Created] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-02-24 Thread Ottomata
Ottomata created this task. Ottomata added projects: Epic, Wikidata-Query-Service, Wikidata. TASK DESCRIPTION TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: #analytics, dcausse

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-18 Thread Ottomata
Ottomata added subscribers: JAllemandou, Ottomata. Ottomata added a comment. COOL! :) > it's important to note that the state of step 3 is tightly coupled with its dump and thus we will have to instantiate a new stream per imported dump. In other words a wdqs system imported using d

[Wikidata-bugs] [Maniphest] [Retitled] T209655: Copy Wikidata dumps to HDFS

2019-11-07 Thread Ottomata
Ottomata renamed this task from "Copy Wikidata dumps to HDFs" to "Copy Wikidata dumps to HDFS". TASK DETAIL https://phabricator.wikimedia.org/T209655 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: WMDE-leszek, ab

[Wikidata-bugs] [Maniphest] [Commented On] T101013: Log Wikidata Query Service queries to the event gate infrastructure

2019-11-06 Thread Ottomata
Ottomata added a comment. > Sorry about this stupid queue name No worries, it's beta ┐|・ิω・ิ#|┌ TASK DETAIL https://phabricator.wikimedia.org/T101013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse, Ottomata Cc: Igorkim78, JAlleman

[Wikidata-bugs] [Maniphest] [Assigned] T236500: large number of 504 errors from ulsfo

2019-10-29 Thread Ottomata
Ottomata assigned this task to ema. Ottomata added a comment. Ema looks like you are working on this. Assigning to you as part of clinic duty. Feel free to resolve if done. TASK DETAIL https://phabricator.wikimedia.org/T236500 EMAIL PREFERENCES https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Commented On] T101013: Log Wikidata Query Service queries to the event gate infrastructure

2019-10-23 Thread Ottomata
Ottomata added a comment. sparql/query schema is merged. We'll need to do an eventgate-analytics k8s deploy before it can be used. Let me know when you want to start testing this. We can also update eventgate-analytics in deployment-prep first if you can/want to test there. TASK DETAIL

[Wikidata-bugs] [Maniphest] [Commented On] T101013: Log Wikidata Query Service queries to the event gate infrastructure

2019-10-10 Thread Ottomata
Ottomata added a comment. > do we need to do something on the refinery/hadoop side to create the hive table Depending on the name of the stream, it will probably need to be whitelisted in a camus or a refine job. > and/or add a purge mechanism for the 90days retention I b

[Wikidata-bugs] [Maniphest] [Updated] T101013: Log Wikidata Query Service queries to the event gate infrastructure

2019-10-08 Thread Ottomata
Ottomata added a project: EventBus. Restricted Application added a project: Analytics. TASK DETAIL https://phabricator.wikimedia.org/T101013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse, Ottomata Cc: Ottomata, Smalyshev, Deskana, Aklapper

[Wikidata-bugs] [Maniphest] [Commented On] T101013: Log Wikidata Query Service queries to the event gate infrastructure

2019-10-08 Thread Ottomata
Ottomata added a comment. Q: I assume you'll want this to go into the eventgate-analytics instance, yes? I.e. You won't be building production services on this stream, just using it for offline analytics (well, offline performance testing). TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Updated] T176875: Allow access to wdqs.svc.eqiad.wmnet on port 8888

2019-07-09 Thread Ottomata
Ottomata added a comment. Ah, hm ok. Actually, @elukey why can't we allow the VIP IP? We did this in T221690: Allow analytics VLAN to reach schema.svc.$site.wmnet <https://phabricator.wikimedia.org/T221690>, no? TASK DETAIL https://phabricator.wikimedia.org/T176875

[Wikidata-bugs] [Maniphest] [Commented On] T176875: Allow access to wdqs.svc.eqiad.wmnet on port 8888

2019-07-09 Thread Ottomata
Ottomata added a comment. @Addshore, just saw T218710 <https://phabricator.wikimedia.org/T218710> and clicked through to here. If you use https://wikitech.wikimedia.org/wiki/HTTP_proxy, you can access wdqs.svc.eqiad.wmnet over HTTP from the analytics VLAN. TASK DETAIL

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T225195: EventBus jobs failing heavily because of CentralNotice and WikibaseRepo

2019-06-06 Thread Ottomata
Ottomata added a subscriber: Umherirrender. Ottomata added a comment. @umherirrender I'd like to revert https://gerrit.wikimedia.org/r/c/integration/config/+/513663 as it is causing blockers for EventBus changes. Objections? TASK DETAIL https://phabricator.wikimedia.org/T225195 EMAIL

[Wikidata-bugs] [Maniphest] [Commented On] T225195: EventBus jobs failing heavily because of CentralNotice and WikibaseRepo

2019-06-06 Thread Ottomata
Ottomata added a comment. The failures I see in https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/52212/console don't really have to do with nulls passed to EventBus though. They just look like failing CentralNotice tests, possibly due to bad fixture data? E.g

[Wikidata-bugs] [Maniphest] [Commented On] T225195: EventBus jobs failing heavily because of CentralNotice and WikibaseRepo

2019-06-06 Thread Ottomata
Ottomata added a comment. So I suppose since those patches are in two different repos, tests on each fail until they are both merged? I think we should merge them. This is blocking other EventBus patches too. TASK DETAIL https://phabricator.wikimedia.org/T225195 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] [Commented On] T215413: Image Classification Working Group

2019-05-30 Thread Ottomata
Ottomata added a comment. Not sure if this is relevant, but this seemed the best place to note. I just came across: https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN It seems relatively easy to package up (e.g. on a notebook host) and ship to hdfs and then include

[Wikidata-bugs] [Maniphest] [Commented On] T212550: Implement support for ChronologyProtection in events sent when editing Mediawiki/Wikidata

2019-04-30 Thread Ottomata
Ottomata added a comment. I guess my comment here got missed? https://gerrit.wikimedia.org/r/c/mediawiki/core/+/505328#message-f9c4c6e4cf0235e8dfa743ced1dc17ac6ca1dd21 TASK DETAIL https://phabricator.wikimedia.org/T212550 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2019-04-24 Thread Ottomata
Ottomata added a comment. Can we close this task? TASK DETAIL https://phabricator.wikimedia.org/T161731 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: gerritbot, JAllemandou, Pchelolo, Ladsgroup, Nuria, Anomie, Aklapper, Smalyshev

[Wikidata-bugs] [Maniphest] [Commented On] T212550: Implement support for ChronologyProtection in RDF export

2019-04-18 Thread Ottomata
Ottomata added a comment. Hm, can we add some more documentation, especially to the field in the event schema? I'm not familiar with ChronologyProtector, and googling around for docs isn't yielding much. TASK DETAIL https://phabricator.wikimedia.org/T212550 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] [Commented On] T214706: Surface link changes as a stream

2019-02-12 Thread Ottomata
Ottomata added a comment. Just saw some errors in prod Failed processing event: Topic mediawiki.page-links-change not configured ^ should fix. TASK DETAIL https://phabricator.wikimedia.org/T214706 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T209655: Copy Wikidata dumps to HDFs

2018-12-06 Thread Ottomata
Ottomata added subscribers: Nuria, Ottomata. Ottomata added a comment. @Nuria we should fit this in somewhere! Maybe a Q3 goal? :D TASK DETAIL https://phabricator.wikimedia.org/T209655 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Nuria

[Wikidata-bugs] [Maniphest] [Commented On] T210702: Verify that EventStreams work with WikiBase MediaInfo

2018-11-29 Thread Ottomata
Ottomata added a comment. Cool! EventStreams is deployed in beta; however, real production data does not go there. If your code that generates events is deployed to beta, you should be able to consume those events triggered by actions in beta from stream-beta.wmflabs.org. How will your events

[Wikidata-bugs] [Maniphest] [Commented On] T210451: Kafka eqiad.mediawiki.page-delete topic is empty

2018-11-26 Thread Ottomata
Ottomata added a comment. Oo, I just did the same, or, at least I copied the relevant files. They are on stat1004:/home/otto/eventbus-validation-logs0. Stas said he might have another way so I stopped there. TASK DETAIL https://phabricator.wikimedia.org/T210451 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] [Commented On] T207817: WDQS Updater ran into issue and stopped working

2018-10-25 Thread Ottomata
Ottomata added a comment. Thanks @Smalyshev, I think you are right that changes like this should be announced a bit better. We'll try to do better next time. I wonder...where should we announce changes like this? engineering@? wikitech-l? TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Commented On] T207817: WDQS Updater ran into issue and stopped working

2018-10-24 Thread Ottomata
Ottomata added a comment. Maybe jackson just can't parse this microsecond stuff? Maybe milliseconds are fine? TASK DETAIL https://phabricator.wikimedia.org/T207817 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Pchelolo, Tarrow, WMDE

[Wikidata-bugs] [Maniphest] [Updated] T207817: WDQS Updater ran into issue and stopped working

2018-10-24 Thread Ottomata
Ottomata added subscribers: Pchelolo, Ottomata. Ottomata added a comment. Ping @Pchelolo. This is happening because of https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/EventBus/+/468482/ as part of a fix for T207329: Clear watchlist on enwiki only removes 50 items at a time. Interesting

[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating

2018-09-24 Thread Ottomata
Ottomata added a comment. Ok, I've added the analytics-search system user to the analytics-search-users group. You should make your script chgrp analytics-search-users after it creates it. TASK DETAIL https://phabricator.wikimedia.org/T204415 EMAIL PREFERENCES https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating

2018-09-24 Thread Ottomata
Ottomata added a comment. Oh sorry, misunderstood. Yes we should be able to make the output of the file writable by you somehow. TASK DETAIL https://phabricator.wikimedia.org/T204415 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mpopov, Ottomata Cc: gerritbot

[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating

2018-09-24 Thread Ottomata
Ottomata added a comment. @mpopov, since that file is managed by Puppet, you'll have to make a puppet patch to change it! TASK DETAIL https://phabricator.wikimedia.org/T204415 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: mpopov, Ottomata Cc: gerritbot, Gehel

[Wikidata-bugs] [Maniphest] [Commented On] T204415: Query stats dashboard not updating

2018-09-24 Thread Ottomata
Ottomata added a comment. All the Hive queries (and related) should be using 'webrequest_text' from now on. e.g. WHERE webrequest_source = 'text' TASK DETAIL https://phabricator.wikimedia.org/T204415 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Nuria

[Wikidata-bugs] [Maniphest] [Updated] T204415: Query stats dashboard not updating

2018-09-24 Thread Ottomata
Ottomata added a comment. See also T200822: Remove webrequest misc analytics related jobs and code after cache misc -> text merge is complete and T164609: Merge cache_misc into cache_text functionally. Sorry yall didn't know about this. I wonder if there is a better way we can config

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2018-06-25 Thread Ottomata
Ottomata added a comment. OO yes @Smalyshev and in case you didn't see, we also increased retention of mediawiki topics to 31 days in the main kafka clusters. TASK DETAIL https://phabricator.wikimedia.org/T161731 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences

[Wikidata-bugs] [Maniphest] [Updated] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata set the point value for this task to "2". TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: mforns, elukey, Ottomata, Aklapper, Nuria, Ladsgroup, Pchelolo, JAllemandou, Smaly

[Wikidata-bugs] [Maniphest] [Commented On] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata added a comment. Doing the following for all main-eqiad and main-codfw: for t in \ mediawiki.page-create\ mediawiki.page-delete\ mediawiki.page-edit \ mediawiki.page-move

[Wikidata-bugs] [Maniphest] [Retitled] T187296: Increase kafka event retention to 31

2018-06-12 Thread Ottomata
Ottomata renamed this task from "Increase kafka event retention to 14 or 21 days" to "Increase kafka event retention to 31". Ottomata claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/pa

[Wikidata-bugs] [Maniphest] [Commented On] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata added a comment. mediawiki eventbus topics should now be retained for 31 days in main Kafka clusters. If we add a new mediawiki topic, we need to remember to run this command for it. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] [Updated] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata added a project: Analytics-Kanban. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: mforns, elukey, Ottomata, Aklapper, Nuria, Ladsgroup, Pchelolo, JAllemandou, Smalyshev, Lahi, Gq86

[Wikidata-bugs] [Maniphest] [Commented On] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata added a comment. I'll make this 31 days just to bump it up to a month. We have plenty of space for this. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: mforns, elukey, Ottomata

[Wikidata-bugs] [Maniphest] [Updated] T187296: Increase kafka event retention to 14 or 21 days

2018-06-12 Thread Ottomata
Ottomata merged a task: T196409: Consider increasing retention for mediawiki event topics. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: mforns, elukey, Ottomata, Aklapper, Nuria, Ladsgroup

[Wikidata-bugs] [Maniphest] [Commented On] T187296: Increase kafka event retention to 14 or 21 days

2018-02-20 Thread Ottomata
Ottomata added a comment. Is there a reason we want to do this on main instead of jumbo? Stas will be consuming from jumbo, since it has timestamp offset support. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences

[Wikidata-bugs] [Maniphest] [Commented On] T187296: Increase kafka event retention to 14 or 21 days

2018-02-14 Thread Ottomata
Ottomata added a comment. I think we can do this just for the mediawiki eventbus topics on the jumbo cluster. TASK DETAIL https://phabricator.wikimedia.org/T187296 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Ottomata, Aklapper, Nuria, Ladsgroup

[Wikidata-bugs] [Maniphest] [Triaged] T178445: flapping monitoring for recommendation_api on scb

2018-01-16 Thread Ottomata
Ottomata triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T178445 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Joe, Volans, mobrovac, Smalyshev, Gehel, Stashbot, Aklapper, Dzahn, Qtn1293,

[Wikidata-bugs] [Maniphest] [Triaged] T181988: Investigate and improve memory allocation rates of WDQS

2018-01-16 Thread Ottomata
Ottomata triaged this task as "Normal" priority. TASK DETAIL https://phabricator.wikimedia.org/T181988 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: Aklapper, Smalyshev, Gehel, Qtn1293, Lahi, Gq86, Darkminds3113, Lucas_Werkme
