[Wikidata-bugs] [Maniphest] T336056: [ES-M2]: Add EntitySchema URI to RDF output

2023-06-07 Thread dcausse
dcausse added a comment. In T336056#8910743 <https://phabricator.wikimedia.org/T336056#8910743>, @dcausse wrote: > @hoo I tested this but some work (hopefully trivial) is required before the WDQS munger is able to interpret these URIs properly in the RDF output (created a s

[Wikidata-bugs] [Maniphest] T338352: Add support for EntitySchema URIs in the WDQS munger

2023-06-07 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. It should support such URIs already; the warnings I saw were because I set up the test improperly. TASK DETAIL https://phabricator.wikimedia.org/T338352 EMAIL PREFERENCES https://phabricator.wikimedia.org/sett

[Wikidata-bugs] [Maniphest] T336056: [ES-M2]: Add EntitySchema URI to RDF output

2023-06-07 Thread dcausse
dcausse changed the status of subtask T338352: Add support for EntitySchema URIs in the WDQS munger from "Declined" to "Invalid". TASK DETAIL https://phabricator.wikimedia.org/T336056 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/

[Wikidata-bugs] [Maniphest] T338352: Add support for EntitySchema URIs in the WDQS munger

2023-06-07 Thread dcausse
dcausse changed the task status from "Declined" to "Invalid". TASK DETAIL https://phabricator.wikimedia.org/T338352 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: KEMONO_PANTSU_KEMONO_PANTSU_KEMONO_PANTSU_KEM

[Wikidata-bugs] [Maniphest] T336056: [ES-M2]: Add EntitySchema URI to RDF output

2023-06-07 Thread dcausse
dcausse closed subtask T338352: Add support for EntitySchema URIs in the WDQS munger as "Declined". TASK DETAIL https://phabricator.wikimedia.org/T336056 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: hoo, dcausse Cc: dcausse,

[Wikidata-bugs] [Maniphest] T336056: [ES-M2]: Add EntitySchema URI to RDF output

2023-06-07 Thread dcausse
dcausse added a comment. @hoo I tested this but some work (hopefully trivial) is required before the WDQS munger is able to interpret these URIs properly in the RDF output (created a subtask to track this work). TASK DETAIL https://phabricator.wikimedia.org/T336056 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T338352: Add support for EntitySchema URIs in the WDQS munger

2023-06-07 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. TASK DESCRIPTION The WDQS munger should support wikibase RDF output that has an entity schema URI. The current implementation does not recognize these objects and warns about them with: 17:30

[Wikidata-bugs] [Maniphest] T338255: Lexemes (some?) are not properly indexed by CirrusSearch

2023-06-07 Thread dcausse
dcausse added a comment. In T338255#8908886 <https://phabricator.wikimedia.org/T338255#8908886>, @Nikki wrote: > In T338255#8908771 <https://phabricator.wikimedia.org/T338255#8908771>, @dcausse wrote: > >> Here is a list https://people.wikimedia.org/~dca

[Wikidata-bugs] [Maniphest] T338255: Lexemes (some?) are not properly indexed by CirrusSearch

2023-06-07 Thread dcausse
dcausse added a comment. Here is a list https://people.wikimedia.org/~dcausse/T338255-lexemes.tsv (they might not all be related to this problem but I suspect most of them are) TASK DETAIL https://phabricator.wikimedia.org/T338255 EMAIL PREFERENCES https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T338255: Lexemes (some?) are not properly indexed by CirrusSearch

2023-06-07 Thread dcausse
dcausse added a comment. Thanks for looking into this! We found this problem by analyzing the divergences between mysql and the elasticsearch index behind CirrusSearch: we found 24000 Lexemes (very high compared to other content models) that are not indexed. I can provide a list of page ids

[Wikidata-bugs] [Maniphest] T338255: Lexemes (some?) are not properly indexed by CirrusSearch

2023-06-06 Thread dcausse
dcausse created this task. dcausse added projects: CirrusSearch, Wikidata. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Discovery-Search. TASK DESCRIPTION Building the CirrusSearch document does seem to trigger an exception: `Caught exception of

[Wikidata-bugs] [Maniphest] T332953: Migrate PipelineLib repos to GitLab

2023-06-01 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T332953 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: jnuche, isarantopoulos, BPirkle, Tgr, Eevans, Seddon, MSantos, kevinbazira, odimitrijevic

[Wikidata-bugs] [Maniphest] T336709: Allow federated queries with the BNCF SPARQL endpoint

2023-05-22 Thread dcausse
dcausse removed EBernhardson as the assignee of this task. dcausse added a subscriber: EBernhardson. TASK DETAIL https://phabricator.wikimedia.org/T336709 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: EBernhardson, dcausse, Aklapper

[Wikidata-bugs] [Maniphest] T336799: Add https://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS/query to WDQS federated endpoints

2023-05-16 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Requested by Epìdosis via https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input#BNCF TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T336134: wdqs2*** lagged for more than one day

2023-05-11 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T336134 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bking, dcausse Cc: bking, Aklapper, Bugreporter, Themindcoder, Adamm71, Jersione, Hellket777, LisafBia6531

[Wikidata-bugs] [Maniphest] T336134: wdqs2*** lagged for more than one day

2023-05-10 Thread dcausse
dcausse lowered the priority of this task from "Unbreak Now!" to "High". dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T336134 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bking, dcausse

[Wikidata-bugs] [Maniphest] T331405: Query service maxlag calculation should exclude datacenters that don't receive traffic and where the updater is turned off

2023-04-26 Thread dcausse
dcausse added a comment. @ItamarWMDE the patch finally got deployed, the prometheus query should look like: `max(time() - label_replace(blazegraph_lastupdated, "host", "$1", "instance", "^([^:]+):.*

[Wikidata-bugs] [Maniphest] T334823: Add https://opendata.aragon.es/sparql to the list of federated endpoints for WDQS and WCQS

2023-04-17 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Requested via https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#New_federated_query_service: >> I want t

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-04-04 Thread dcausse
dcausse moved this task from In Progress to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. the flink job is properly running from the k8s-dse cluster using the flink-operator: https://grafana.wikimedia.org/d/gCFgfpG7k/flink-cluster?orgId=1&

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-03-31 Thread dcausse
dcausse added a comment. currently still stuck with scala.MatchError: None (of class scala.None$) at org.wikidata.query.rdf.updater.SideOutputSerializationSchema.getRecordClock(SideOutputSerializationSchema.scala:31) at

[Wikidata-bugs] [Maniphest] T333373: The WDQS streaming updater should support connecting to kafka with SSL

2023-03-28 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T333373 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T333373: The WDQS streaming updater should support connecting to kafka with SSL

2023-03-28 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The way we configure the kafka consumer assumes that we pass the PLAINTEXT port. We should have a way to configure the security protocol to use

[Wikidata-bugs] [Maniphest] T331405: Query service maxlag calculation should exclude datacenters that don't receive traffic and where the updater is turned off

2023-03-28 Thread dcausse
dcausse added a comment. Restricted Application added a project: User-ItamarWMDE. @ItamarWMDE once https://gerrit.wikimedia.org/r/900729 is deployed we should be able to create a grafana query like the one suggested by Joe and adjust the threshold to double check that the query does what we

[Wikidata-bugs] [Maniphest] T297870: WDQS Streaming Updater fails with Timeout expired after 60000milliseconds while awaiting InitProducerId

2023-03-16 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. We might have to switch to KafkaSource and this might change the behavior of flink during this kind of scenario. TASK DETAIL https://phabricator.wikimedia.org/T297870 EMAIL PREFERENCES https://phabricator.wik

[Wikidata-bugs] [Maniphest] T331271: Add https://data.europa.eu/sparql to WDQS federated services allow list

2023-03-13 Thread dcausse
dcausse claimed this task. dcausse moved this task from Incoming to To Be Deployed on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T331271 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T331405: Query service maxlag calculation should exclude datacenters that don't receive traffic and where the updater is turned off

2023-03-09 Thread dcausse
dcausse added a comment. WDQS lag issues should be rare now, a node not serving traffic should (as of today) be able to ingest ~8x the throughput that we usually see on wikidata so we should not worry about them. Using `blazegraph_queries_done_total` to identify nodes that should be part

[Wikidata-bugs] [Maniphest] T316882: RdfStreamingUpdaterHighConsumerUpdateLag alert is not fired

2023-03-09 Thread dcausse
dcausse closed this task as "Resolved". dcausse assigned this task to fgiunchedi. dcausse added a comment. can confirm it's fixed by the above patch, saw multiple alerts of this kind being fired during the k8s upgrade. Thanks! TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T331405: Depooled servers may still be taken into account for query service maxlag

2023-03-07 Thread dcausse
dcausse added a comment. In T331405#8672341 <https://phabricator.wikimedia.org/T331405#8672341>, @Joe wrote: > Updates shouldn't depend on where the discovery dns record points to, but rather go to the local datacenter directly. > > I think the bug here is with

[Wikidata-bugs] [Maniphest] T331271: Add https://data.europa.eu/sparql to WDQS federated services allow list

2023-03-06 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Requested via https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search <https://www.wikidata.org/w

[Wikidata-bugs] [Maniphest] T294133: Expose rdf-streaming-updater.mutation content through EventStreams

2023-02-24 Thread dcausse
dcausse added a parent task: T330521: Make WDQS update stream public. TASK DETAIL https://phabricator.wikimedia.org/T294133 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: RBrounley_WMF, Harej, Ottomata, Aklapper, dcausse, Mohamed

[Wikidata-bugs] [Maniphest] T330521: Make WDQS update stream public

2023-02-24 Thread dcausse
dcausse added a subtask: T294133: Expose rdf-streaming-updater.mutation content through EventStreams. TASK DETAIL https://phabricator.wikimedia.org/T330521 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Jheald, Gehel, KingsleyIdehen

[Wikidata-bugs] [Maniphest] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2023-02-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Gehel, Addshore, dcausse, Aklapper, Astuthiodit_1, AWesterinen, BeautifulBold, Suran38

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-02-14 Thread dcausse
dcausse claimed this task. dcausse moved this task from incoming to in progress on the Wikidata board. TASK DETAIL https://phabricator.wikimedia.org/T328675 WORKBOARD https://phabricator.wikimedia.org/project/board/71/ EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T328625: Federated queries with AGROVOC are not working

2023-02-13 Thread dcausse
dcausse moved this task from Ready for Dev -- SRE/Ops to To Be Deployed on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T328625 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-02-13 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T328675 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: pfischer, bking, dcausse, Aklapper, Themindcoder, Adamm71, Jersione, Hellket777, LisafBia6531

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-09 Thread dcausse
dcausse moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Q115608572 was restored after applying the fix, items that were not edited after being undeleted will sadly remain absent from WDQS until an edit is made or

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse added a comment. The inconsistencies were properly detected by the updater: select * from rdf_streaming_updater_state_inconsistency where year=2023 AND month=01 AND day=12 AND meta.domain="www.wikidata.org" AND item="Q115608572" AND datacenter="codfw&

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T329089 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Themindcoder, Adamm71, Jersione, Hellket777, LisafBia6531, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T329089 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Themindcoder, Adamm71, Jersione, Hellket777, LisafBia6531

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse added a comment. The reconcile batch job seems to be the one to blame, it reports: `23/01/12 03:29:25 INFO ReconcileCollector: Collected 0 inconsistencies from event.rdf_streaming_updater_state_inconsistency/datacenter=eqiad/year=2023/month=1/day=12/hour=0` While clearly there was

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T329089 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T329089: The rdf-streaming-updater does not reconcile missed page-undelete events

2023-02-07 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Receiving a revision-create event after a page-delete event should be dealt with as an inconsistency and be processed by the reconciliation job

[Wikidata-bugs] [Maniphest] T293063: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes

2023-02-02 Thread dcausse
dcausse added a comment. In T293063#8582625 <https://phabricator.wikimedia.org/T293063#8582625>, @JMeybohm wrote: > Anyhow. AIUI this process will be more or less the same for flink deployments managed by the flink operator. It would be nice if you could verify this during y

[Wikidata-bugs] [Maniphest] T293063: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes

2023-02-02 Thread dcausse
dcausse added a comment. In T293063#8582548 <https://phabricator.wikimedia.org/T293063#8582548>, @JMeybohm wrote: > Hey @dcausse, I'm reading this again because of the upcoming k8s 1.23 upgrade and was wondering: > In "To restore:" section of "Alter

[Wikidata-bugs] [Maniphest] T293063: Write and adapt Runbooks and cookbooks related to the WDQS Streaming Updater and kubernetes

2023-02-02 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T293063 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: akosiaris, RKemper, Gehel, bking, JMeybohm, Jelto, Aklapper, jijiki, dcausse, Astuthiodit_1

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-02-02 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: BTullis, JMeybohm, gmodena, Ottomata, bking, Aklapper, dcausse, Adamm71, Jersione, Hellket777

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-02-02 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: BTullis, JMeybohm, gmodena, Ottomata, bking, Aklapper, dcausse, Adamm71, Jersione, Hellket777

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-02-02 Thread dcausse
dcausse added a parent task: T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model. TASK DETAIL https://phabricator.wikimedia.org/T328675 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: bking

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-02-02 Thread dcausse
dcausse added a subtask: T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc

[Wikidata-bugs] [Maniphest] T328675: Create a dse-k8s service demonstrating how to run the rdf-streaming-updater using the flink-app chart

2023-02-02 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The flink-k8s-operator is not available on the dse-k8s cluster. We should create a new helmfile using the `flink-app` chart to run an

[Wikidata-bugs] [Maniphest] T289836: Upgrade the WDQS streaming updater to latest flink (1.16)

2023-01-30 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. The updater with flink 1.16 ran for about 3 days in yarn; the notable change is that it required 5g of mem to run without failure during backfills. Current

[Wikidata-bugs] [Maniphest] T322869: Fewer results from wdqs nodes running in codfw than eqiad

2023-01-25 Thread dcausse
dcausse added a comment. @Oravrattas thanks for the report, I've depooled these machines because these are new hosts that are not ready to serve user traffic yet (they have an empty/partially loaded database). TASK DETAIL https://phabricator.wikimedia.org/T322869 EMAIL PREFER

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-24 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: JMeybohm, gmodena, Ottomata, bking, Aklapper, dcausse, Astuthiodit_1, AWesterinen, karapayneWMDE

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-24 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: JMeybohm, gmodena, Ottomata, bking, Aklapper, dcausse, Astuthiodit_1, AWesterinen, karapayneWMDE

[Wikidata-bugs] [Maniphest] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2023-01-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Gehel, Addshore, dcausse, Aklapper, Astuthiodit_1, AWesterinen, BeautifulBold, Suran38

[Wikidata-bugs] [Maniphest] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2023-01-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Gehel, Addshore, dcausse, Aklapper, Astuthiodit_1, AWesterinen, BeautifulBold, Suran38

[Wikidata-bugs] [Maniphest] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2023-01-20 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Gehel, Addshore, dcausse, Aklapper, Astuthiodit_1, AWesterinen, BeautifulBold, Suran38

[Wikidata-bugs] [Maniphest] T325730: Wikidata constraint check is getting throttled from wdqs-internal more than usual

2023-01-19 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. Everything looks fine from my end! closing :) TASK DETAIL https://phabricator.wikimedia.org/T325730 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: bk

[Wikidata-bugs] [Maniphest] T289836: Upgrade the WDQS streaming updater to latest flink (1.16)

2023-01-13 Thread dcausse
dcausse renamed this task from "Upgrade the WDQS streaming updater to latest flink (1.15)" to "Upgrade the WDQS streaming updater to latest flink (1.16)". TASK DETAIL https://phabricator.wikimedia.org/T289836 EMAIL PREFERENCES https://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] T326914: Migrate the WDQS streaming updater from FlinkKafkaConsumer/Producer to KafkaSource/Sink

2023-01-13 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION FlinkKafkaConsumer <https://nightlies.apache.org/flink/flink-docs-release-1.14/release-notes/flink-1.14/#deprecate-flinkkafkaconsumer>

[Wikidata-bugs] [Maniphest] T289836: Upgrade the WDQS streaming updater to latest flink (1.15)

2023-01-12 Thread dcausse
dcausse moved this task from Waiting to In Progress on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T289836 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-11 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: bking, Aklapper, dcausse, Astuthiodit_1, AWesterinen, karapayneWMDE, Invadibot, MPhamWMF

[Wikidata-bugs] [Maniphest] T325992: Index sense glosses in CirrusSearch

2023-01-09 Thread dcausse
dcausse moved this task from Feature Requests to Wikibase Search on the Discovery-Search board. dcausse added a comment. This ticket is, I think, about finishing the work regarding the integration of WikibaseLexeme with CirrusSearch to index and query senses when performing a fulltext search

[Wikidata-bugs] [Maniphest] T326535: Automatically restart blazegraph when BlazegraphFreeAllocatorsDecreasingRapidly is about to be fired

2023-01-09 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326535 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T326535: Automatically restart blazegraph when BlazegraphFreeAllocatorsDecreasingRapidly is about to be fired

2023-01-09 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The alert BlazegraphFreeAllocatorsDecreasingRapidly is fired when blazegraph starts to consume its free allocators too quickly; this currently

[Wikidata-bugs] [Maniphest] T289836: Upgrade the WDQS streaming updater to latest flink (1.15)

2023-01-06 Thread dcausse
dcausse added a parent task: T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model. TASK DETAIL https://phabricator.wikimedia.org/T289836 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-06 Thread dcausse
dcausse added a subtask: T289836: Upgrade the WDQS streaming updater to latest flink (1.15). TASK DETAIL https://phabricator.wikimedia.org/T326409 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Astuthiodit_1

[Wikidata-bugs] [Maniphest] T326409: Migrate the wdqs streaming updater flink jobs to flink-k8s-operator deployment model

2023-01-06 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The WDQS streaming updater is using a session cluster mode but the work done on T324576 <https://phabricator.wikimedia.org/T324576> should al

[Wikidata-bugs] [Maniphest] T326305: Misconfigured proxies on wdqs hosts

2023-01-05 Thread dcausse
dcausse added a project: Wikidata-Query-Service. TASK DETAIL https://phabricator.wikimedia.org/T326305 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, MoritzMuehlenhoff, Aklapper, jbond, bking, RKemper, ayounsi, AWesterinen

[Wikidata-bugs] [Maniphest] T326311: Deletion of Lexemes appears to leak triples related to its forms and senses

2023-01-05 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T326311 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Nikki, Aklapper, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T326311: Deletion of Lexemes appears to leak triples related to its forms and senses

2023-01-05 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION In T302189 <https://phabricator.wikimedia.org/T302189> it was reported: In T302189#8501314 <https://phabricator.wikimedia.org/T30218

[Wikidata-bugs] [Maniphest] T302189: Regularly purge orphaned sitelink, value and reference nodes

2023-01-05 Thread dcausse
dcausse added a comment. In T302189#8501314 <https://phabricator.wikimedia.org/T302189#8501314>, @Nikki wrote: > This report of grammatical features <https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Statistics/Count_of_forms_by_grammatical_feature> is w

[Wikidata-bugs] [Maniphest] T325730: Wikidata constraint check is getting throttled from wdqs-internal more than usual

2022-12-21 Thread dcausse
dcausse renamed this task from "Wikidata constraint check is getting throttled from wdsq-internal more than usual" to "Wikidata constraint check is getting throttled from wdqs-internal more than usual". TASK DETAIL https://phabricator.wikimedia.org/T325730 EMAIL

[Wikidata-bugs] [Maniphest] T325730: Wikidata constraint check is getting throttled from wdsq-internal more than usual

2022-12-21 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Wikibase-Quality-Constraints. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Checking the telemetry metrics for jobrunner -> wdqs-internal we found weird patterns <https://grafana.wikimedia

[Wikidata-bugs] [Maniphest] T304914: Remove the presto client for swift from the flink image

2022-12-19 Thread dcausse
dcausse moved this task from Operations/SRE to Current work on the Wikidata-Query-Service board. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T304914 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T289836: Upgrade the WDQS streaming updater to latest flink (1.15)

2022-12-19 Thread dcausse
dcausse added a project: Discovery-Search (Current work). dcausse moved this task from Scaling to Current work on the Wikidata-Query-Service board. TASK DETAIL https://phabricator.wikimedia.org/T289836 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS

2022-12-16 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T244341 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: ericP, JMinor, TomT0m, Gehel, Multichill, Pfps, Mmarx, Dipsacus_fullonum, Luitzen

[Wikidata-bugs] [Maniphest] T324345: Wikidata Query Service outputs 28th February for 29th February

2022-12-05 Thread dcausse
dcausse added a comment. Let's keep the WDQS tag on this ticket because if you change how the RDF output is generated we might have to re-import the data to WDQS to make sure that all these dates get cleaned-up. TASK DETAIL https://phabricator.wikimedia.org/T324345 EMAIL PREFER

[Wikidata-bugs] [Maniphest] T323239: Badges for sitelinks not getting updated in query service after a move

2022-11-23 Thread dcausse
dcausse added a comment. I believe this problem is similar to what was reported in https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Updater_issue_? My understanding of this problem is as follows: The Wikibase RDF for sitelinks uses the URL of the link

[Wikidata-bugs] [Maniphest] T322869: Fewer results from wdqs nodes running in codfw than eqiad

2022-11-10 Thread dcausse
dcausse added a comment. Recording a few findings: checked 3 examples and they all relate to edits happening around `2022-05-05T18:03:00`: - https://www.wikidata.org/w/index.php?title=Q2053506&action=history (right after revision 1579954206) - https://www.wikidata.org/w/index.php?t

[Wikidata-bugs] [Maniphest] T322869: Fewer results from wdqs nodes running in codfw than eqiad

2022-11-10 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Reported by User:Oravrattas via https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Fewer_results_from_wdqs20

[Wikidata-bugs] [Maniphest] T322067: Allow federated queries with Bioontology SPARQL Endpoint

2022-11-03 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. This SPARQL endpoint requires a `csrfmiddlewaretoken` parameter to work and I don't think we can configure Blazegraph to provide it (and I have no clue how it is supposed to work nor from where it is obtained

[Wikidata-bugs] [Maniphest] T322067: Allow federated queries with Bioontology SPARQL Endpoint

2022-11-03 Thread dcausse
dcausse added a comment. @Nikki this sparql endpoint states: > Notice: This SPARQL endpoint is maintained by NCBO for demo purposes. It serves as playground to explore BioPortal's ontologies in RDF and we do not recommend its use for production applications or heavy batch proces

[Wikidata-bugs] [Maniphest] T322010: Depool wdqs1007

2022-10-31 Thread dcausse
dcausse lowered the priority of this task from "Unbreak Now!" to "High". dcausse added a comment. While fixing T238751 <https://phabricator.wikimedia.org/T238751> I think the criteria to enable max lag did change from the median of all the servers to the most lagg

[Wikidata-bugs] [Maniphest] T316236: Reload WCQS from dumps

2022-10-25 Thread dcausse
dcausse added a comment. @bking I did not spot any errors, the `not found, terminating` line is expected I guess. The reload cookbook seemed to have taken care of a bunch of the first 3 steps you mention, I think the remaining steps are just to propagate the journal using the data

[Wikidata-bugs] [Maniphest] T316031: Clean up the rdf-streaming-updater-codfw container from thanos-swift.

2022-10-24 Thread dcausse
dcausse added a comment. @bking I see that the doc has been updated, can we move this ticket to the Needs reporting column? TASK DETAIL https://phabricator.wikimedia.org/T316031 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bking, dcausse Cc

[Wikidata-bugs] [Maniphest] T321282: make the Property namespace on Wikidata a content namespace

2022-10-21 Thread dcausse
dcausse added a comment. Note on CirrusSearch and the search indices: this should have no impact as Properties are already part of `wgNamespacesToBeSearchedDefault` and thus already considered "content" for CirrusSearch so moving this namespace (120) to `wgContentNamespaces` will

[Wikidata-bugs] [Maniphest] T316236: Reload WCQS from dumps

2022-10-17 Thread dcausse
dcausse reassigned this task from EBernhardson to bking. dcausse added a comment. @bking what is the status of this ticket? TASK DETAIL https://phabricator.wikimedia.org/T316236 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bking, dcausse Cc

[Wikidata-bugs] [Maniphest] T242453: Detect and alert and/or remediate Blazegraph deadlocks

2022-09-26 Thread dcausse
dcausse removed a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T242453 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: RKemper, bking, RLazarus, Legoktm, Gehel, William_Avery, CDanis, Addshore

[Wikidata-bugs] [Maniphest] T316005: Add monitoring and alerting on the usage of the rdf-streaming-updater swift containers in thanos

2022-09-26 Thread dcausse
dcausse removed a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T316005 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: fgiunchedi, bking, dcausse, Aklapper, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T316028: Run the rdf-streaming-updater from k8s@codfw

2022-09-26 Thread dcausse
dcausse added a comment. The 2 Flink jobs are running from k8s@codfw now (cf. https://grafana-rw.wikimedia.org/d/gCFgfpG7k/flink-session-cluster?orgId=1&var-datasource=codfw%20prometheus%2Fk8s&var-namespace=rdf-streaming-updater) TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T317530: MediaInfo does seem to allow entities to share same statement IDs

2022-09-26 Thread dcausse
dcausse removed a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T317530 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: EBernhardson, WMDE-leszek, bking, Aklapper, dcausse, Astuthiodit_1

[Wikidata-bugs] [Maniphest] T316005: Add monitoring and alerting on the usage of the rdf-streaming-updater swift containers in thanos

2022-09-22 Thread dcausse
dcausse added a comment. The above patch adds a quick alert on the space used by `auth_WDQS`, it does not address all the ACs of this ticket but I think it is the minimal requirement to make sure we react quickly if similar problems occur in the future. TASK DETAIL https

[Wikidata-bugs] [Maniphest] T316028: Run the rdf-streaming-updater from k8s@codfw

2022-09-22 Thread dcausse
dcausse claimed this task. dcausse moved this task from Prioritized to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T316028 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T314835: wdqs space usage on thanos-swift

2022-09-12 Thread dcausse
dcausse closed this task as "Resolved". dcausse claimed this task. dcausse added a comment. @BCornwall yes, this ticket can be closed, remaining work is tracked here: - complete the cleanup: T316031 <https://phabricator.wikimedia.org/T316031> - restore the job in k8s

[Wikidata-bugs] [Maniphest] T317530: MediaInfo does seem to allow entities to share same statement IDs

2022-09-12 Thread dcausse
dcausse triaged this task as "High" priority. dcausse added a comment. Tentatively setting to high as this will cause data consistency issues. From the updater perspective we have to relax this component to allow such data (it will just report a warning and monitor such issues) b

[Wikidata-bugs] [Maniphest] T317530: MediaInfo does seem to allow entities to share same statement IDs

2022-09-12 Thread dcausse
dcausse created this task. dcausse added projects: SDC General, Commons, Structured-Data-Backlog. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION See in: - https://commons.wikimedia.org/wiki/Special:EntityData/M69231551.json - https://commons.wikimedia.org/wiki

[Wikidata-bugs] [Maniphest] T303831: Productionize Wikidata subgraph analysis

2022-09-08 Thread dcausse
dcausse added a comment. Discussed this with Joseph as we believe that having to configure the cleanup job in another repo is not ideal. It seems that the long term approach might be around using the data catalog (https://datahub.wikimedia.org/) to store some retention metadata and have

[Wikidata-bugs] [Maniphest] T316028: Run the rdf-streaming-updater from k8s@codfw

2022-09-08 Thread dcausse
dcausse added a comment. Will be picked up once the cleanup is done (T316031 <https://phabricator.wikimedia.org/T316031>). TASK DETAIL https://phabricator.wikimedia.org/T316028 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc:

[Wikidata-bugs] [Maniphest] T316882: RdfStreamingUpdaterHighConsumerUpdateLag alert is not fired

2022-09-01 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T316882 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE
