[Wikidata-bugs] [Maniphest] T289517: DCAT AP endpoint is down

2021-08-31 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. Reload of the RDF file fails with HTTP ERROR 404 Problem accessing //bigdata/namespace/dcatap20210827/sparql. Reason

[Wikidata-bugs] [Maniphest] T289836: Upgrade to latest flink (1.13.2)

2021-08-26 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Upstream bug we might have seen while running on our k8s setup: - https://issues.apache.org/jira/browse/FLINK-20417 TASK DETAIL https

[Wikidata-bugs] [Maniphest] T289770: Add hints in response headers for 404 responses in Special:EntityData

2021-08-26 Thread dcausse
dcausse added a project: Wikidata-Query-Service. TASK DETAIL https://phabricator.wikimedia.org/T289770 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Zbyszko, Addshore, Aklapper, MPhamWMF, CBogen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T288230: Promote MediaInfo RDF format to stable

2021-08-26 Thread dcausse
dcausse added a comment. The should be updated with the new predicate introduced in T277665 <https://phabricator.wikimedia.org/T277665> first. TASK DETAIL https://phabricator.wikimedia.org/T288230 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreference

[Wikidata-bugs] [Maniphest] T289754: Triple level deduplication

2021-08-25 Thread dcausse
dcausse added a comment. Thanks for this! These are mostly `sitelinks` I think, the information behind them could perhaps be moved into its own context but I'm unsure if it's necessary. It also shows that wikidata may still have duplicated sitelinks which is not good and i

[Wikidata-bugs] [Maniphest] T286938: Create a plan for a final streaming updater rollout as source of truth for blazegraph instances

2021-08-24 Thread dcausse
dcausse added a comment. In T286938#7302853 <https://phabricator.wikimedia.org/T286938#7302853>, @EBernhardson wrote: > A couple thoughts, perhaps one will even be useful: > >> start import on wdqs1009 and wdqs2008 with --skolemize: best case 10 days (import

[Wikidata-bugs] [Maniphest] T242453: Deadlock in blazegraph blocking all queries and updates

2021-08-24 Thread dcausse
dcausse merged a task: T289551: wdqs1012 flatlined after page for wdqs.svc.eqiad.wmnet timing out. dcausse added subscribers: Legoktm, RLazarus. TASK DETAIL https://phabricator.wikimedia.org/T242453 To

[Wikidata-bugs] [Maniphest] T289551: wdqs1012 flatlined after page for wdqs.svc.eqiad.wmnet timing out

2021-08-24 Thread dcausse
dcausse closed this task as a duplicate of T242453: Deadlock in blazegraph blocking all queries and updates. TASK DETAIL https://phabricator.wikimedia.org/T289551 To: dcausse Cc: dcausse, RLazarus, Legoktm

[Wikidata-bugs] [Maniphest] T289551: wdqs1012 flatlined after page for wdqs.svc.eqiad.wmnet timing out

2021-08-24 Thread dcausse
dcausse added a comment. Thanks for depooling this machine! Seeing this in the graph is generally a symptom of T242453 <https://phabricator.wikimedia.org/T242453> so I'm tentatively closing this task as duplicate. As you noted systemd restarted the service because blazegrap

[Wikidata-bugs] [Maniphest] T264006: Deploy Flink (rdf-streaming-updater) to kubernetes (k8s)

2021-08-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T264006 To: dcausse Cc: So9q, Ottomata, jijiki, dcausse, Zbyszko, akosiaris, Mstyles, Gehel, Aklapper, Biggs657

[Wikidata-bugs] [Maniphest] T289428: U+002C comma is not being excluded by default in simple search input box for CirrusSearch

2021-08-23 Thread dcausse
dcausse removed a project: Elasticsearch. TASK DETAIL https://phabricator.wikimedia.org/T289428 To: dcausse Cc: dcausse, Aklapper, Thadguidry, Invadibot, MPhamWMF, maantietaja, Wilmanbeno, CBogen

[Wikidata-bugs] [Maniphest] T289428: U+002C comma is not being excluded by default in simple search input box for CirrusSearch

2021-08-23 Thread dcausse
dcausse added a comment. Should be evaluated alongside T237645 <https://phabricator.wikimedia.org/T237645> I think as both these tickets involve the same kind of modifications to the analysis chains. TASK DETAIL https://phabricator.wikimedia.org/T289428 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-08-05 Thread dcausse
dcausse removed a parent task: T285710: WDQS lag detection required manual adjustment during DC switchover. TASK DETAIL https://phabricator.wikimedia.org/T244590 To: dcausse Cc: MPhamWMF, Daniel_Mietchen

[Wikidata-bugs] [Maniphest] T288231: Deploy the wdqs streaming updater to production

2021-08-05 Thread dcausse
dcausse added a parent task: T285710: WDQS lag detection required manual adjustment during DC switchover. TASK DETAIL https://phabricator.wikimedia.org/T288231 To: dcausse Cc: Aklapper, dcausse, MPhamWMF

[Wikidata-bugs] [Maniphest] T285710: WDQS lag detection required manual adjustment during DC switchover

2021-08-05 Thread dcausse
dcausse edited subtasks, added: T288231: Deploy the wdqs streaming updater to production; removed: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T285710 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-08-05 Thread dcausse
dcausse added a subtask: T288231: Deploy the wdqs streaming updater to production. TASK DETAIL https://phabricator.wikimedia.org/T244590 To: dcausse Cc: MPhamWMF, Daniel_Mietchen, Thadguidry, tfmorris

[Wikidata-bugs] [Maniphest] T288231: Deploy the wdqs streaming updater to production

2021-08-05 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T288231 To: dcausse Cc: Aklapper, dcausse, MPhamWMF

[Wikidata-bugs] [Maniphest] T288231: Deploy the wdqs streaming updater to production

2021-08-05 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Deployment plan is under discussion at https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/Streaming_Updater_Rollout_Plan This ticket

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-08-05 Thread dcausse
dcausse added a parent task: T285710: WDQS lag detection required manual adjustment during DC switchover. TASK DETAIL https://phabricator.wikimedia.org/T244590 To: dcausse Cc: MPhamWMF, Daniel_Mietchen

[Wikidata-bugs] [Maniphest] T285710: WDQS lag detection required manual adjustment during DC switchover

2021-08-05 Thread dcausse
dcausse added a subtask: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T285710 To: dcausse Cc: dcausse, Legoktm, Gehel, Aklapper

[Wikidata-bugs] [Maniphest] T285710: WDQS lag detection required manual adjustment during DC switchover

2021-08-05 Thread dcausse
dcausse added a comment. In T285710#7261551 <https://phabricator.wikimedia.org/T285710#7261551>, @Legoktm wrote: > @Gehel what ends up consuming that value? Can we have it read the primary DC from conftool? > > For now I've documented this as a

[Wikidata-bugs] [Maniphest] T286938: Create a plan for a final streaming updater rollout as source of truth for blazegraph instances

2021-08-04 Thread dcausse
dcausse moved this task from Ready for Development to Needs review on the Discovery-Search (Current work) board. dcausse claimed this task. dcausse added a comment. Suggested plan: https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/Streaming_Updater_Rollout_Plan TASK DETAIL https

[Wikidata-bugs] [Maniphest] T287969: Flink SwiftTempAuth reports IllegalAccessError when running with java 11

2021-08-03 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION {"@timestamp":"2021-08-03T16:04:26,345","log.level":"WARN","message":"Fa

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-08-03 Thread dcausse
dcausse moved this task from Waiting to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. The build succeeded yesterday in 12 minutes: 19:13:39 [INFO] 19:13:39 [INFO

[Wikidata-bugs] [Maniphest] T287374: PipelineBot for flink-rdf-streaming-updater does not generate proper deployment-chart patches

2021-07-30 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T287374 To: dcausse Cc: thcipriani, jeena, Aklapper, dcausse, Biggs657, Invadibot, Lalamarie69

[Wikidata-bugs] [Maniphest] T287374: PipelineBot for flink-rdf-streaming-updater does not generate proper deployment-chart patches

2021-07-30 Thread dcausse
dcausse added a comment. In T287374#7240481 <https://phabricator.wikimedia.org/T287374#7240481>, @jeena wrote: > Currently the promote step doesn't support updating the image tag of another property, but if you'd like I can update it to allow you to specify where

[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-29 Thread dcausse
dcausse added a subscriber: hnowlan. dcausse added a comment. Hi @hnowlan in case you have ideas, we're investigating why the `cirrusSearchElasticaWrite` job is being backlogged more frequently since the switch. There do not seem to be more messages produced to it, nor do we see

[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-29 Thread dcausse
dcausse added a comment. The ElasticaWrite job seems to be receiving roughly the same number of messages (150/s per partition on average) before and after the switch. Looking at the partitioned topic ElasticWrite, it has been heavily backlogged since the switch: F34569267: Capture d’écr

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-07-29 Thread dcausse
dcausse added a comment. Tried the build a second time but it still takes a lot of time (~2 hours). Looking at the logs, there seem to be some errors related to the cache: 14:25:55 [wikidata-query-rdf-maven-release-docker] $ /bin/bash -xe /tmp/jenkins3553290645522835861.sh

[Wikidata-bugs] [Maniphest] T287443: Flink jobmanager and taskmanager cannot talk to the k8s api server

2021-07-28 Thread dcausse
dcausse moved this task from In Progress to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. seems to be fixed now by providing explicit K8S client env. TASK DETAIL https://phabricator.wikimedia.org/T287443 WORKBOARD https

[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread dcausse
dcausse added a comment. Looking at the metrics of the `cirrusElasticaWrite` job it seems that its backlog time was greatly degraded just after the DC switch (June 28). I don't see anything particular in other dashboards that could explain such a difference in latency. TASK DETAIL

[Wikidata-bugs] [Maniphest] T287443: Flink jobmanager and taskmanager cannot talk to the k8s api server

2021-07-28 Thread dcausse
dcausse added a comment. Will set `kubernetes.disable.hostname.verification` to true for the k8s client for now to unblock this. TASK DETAIL https://phabricator.wikimedia.org/T287443 To: JMeybohm

[Wikidata-bugs] [Maniphest] T287443: Flink jobmanager and taskmanager cannot talk to the k8s api server

2021-07-27 Thread dcausse
dcausse moved this task from To Be Deployed to In Progress on the Discovery-Search (Current work) board. dcausse added a comment. I'm now getting: {"@timestamp":"2021-07-27T16:59:20,553","log.level":"ERROR","message":"

[Wikidata-bugs] [Maniphest] T287443: Flink jobmanager and taskmanager cannot talk to the k8s api server

2021-07-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). dcausse triaged this task as "High" priority. TASK DETAIL https://phabricator.wikimedia.org/T287443 To: dcausse Cc: JMeybohm, dcausse

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-07-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T287445 To: hashar, dcausse Cc: RKemper, hashar, Aklapper, dcausse, Biggs657, Invadibot, Lalamarie69

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-07-27 Thread dcausse
dcausse added a comment. @hashar thanks! I launched a build for testing (https://integration.wikimedia.org/ci/job/wikidata-query-rdf-maven-release-docker/61) TASK DETAIL https://phabricator.wikimedia.org/T287445 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T286436: Deduplicate triples when loading the wikibase RDF dumps into hive

2021-07-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T286436 To: AKhatun_WMF, dcausse Cc: AKhatun_WMF, dcausse, Aklapper, JAllemandou, Biggs657, Invadibot

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-07-27 Thread dcausse
dcausse added a comment. Might be related to T273086 <https://phabricator.wikimedia.org/T273086> TASK DETAIL https://phabricator.wikimedia.org/T287445 To: dcausse Cc: Aklapper, dcausse, MPhamWMF,

[Wikidata-bugs] [Maniphest] T287445: wikidata-query-rdf-maven-release-docker build is too slow and always times out

2021-07-27 Thread dcausse
dcausse created this task. dcausse added projects: Release-Engineering-Team, Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a maintainer of wdqs I want the Jenkins job that makes releases to work so that I can publish releases using CI. 08

[Wikidata-bugs] [Maniphest] T287443: Flink jobmanager and taskmanager cannot talk to the k8s api server

2021-07-27 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, serviceops. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Seen on k8s staging when the jobmanager tries to look up its leader election config maps: {"@timestamp":"2

[Wikidata-bugs] [Maniphest] T287374: PipelineBot for flink-rdf-streaming-updater does not generate proper deployment-chart patches

2021-07-26 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Release Pipeline (Blubber). Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a maintainer of the rdf-streaming-updater I want the patch on the deployment-chart generated by the PipelineBot

[Wikidata-bugs] [Maniphest] T276469: Cookbooks and / or operation procedures are adapted for the new Flink based WDQS Streaming Updater

2021-07-20 Thread dcausse
dcausse added a comment. Note that there's a WIP patch <https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/670242> to ease the transfer of kafka offsets (while transferring a journal from one node to another) that will no longer be maintained in the triple store. The solution

[Wikidata-bugs] [Maniphest] T286935: Find a way to make swift Tempauth usable behind envoy

2021-07-19 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, serviceops, SRE-swift-storage. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Swift with Tempauth does seem to do some hostname negotiation that makes the use of envoy defaults problematic

[Wikidata-bugs] [Maniphest] T285465: Document and analyze the number of parsing errors for parsed WDQS queries

2021-07-19 Thread dcausse
dcausse added a comment. In T285465#7221124 <https://phabricator.wikimedia.org/T285465#7221124>, @JAllemandou wrote: > Thanks @AKhatun_WMF for the analysis. > @dcausse , @Gehel and @MPhamWMF - Do you think it's worth trying to make our parser being able to process quer

[Wikidata-bugs] [Maniphest] T286890: Checkpoint _metadata has grown up to 70Mb

2021-07-19 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a maintainer of the wdqs streaming updater I want to understand why the checkpoint _metadata file has grown to 70m (which requires bumping flink

[Wikidata-bugs] [Maniphest] T283663: Deleted lexeme in Wikidata still in triple store (Wikidata Query Service)

2021-07-12 Thread dcausse
dcausse added a comment. Thanks for the report. The current update mechanism is missing some updates and we have to curate the service from time to time; these items should disappear from the query service in a couple of hours. TASK DETAIL https://phabricator.wikimedia.org/T283663 EMAIL

[Wikidata-bugs] [Maniphest] T286436: Deduplicate triples when loading the wikibase RDF dumps into hive

2021-07-12 Thread dcausse
dcausse renamed this task from "Deduplicate tiples when loading wikibase RDF dataset into hive" to "Deduplicate triples when loading the wikibase RDF dumps into hive". TASK DETAIL https://phabricator.wikimedia.org/T286436 EMAIL PREFERENCES https://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] T286436: Deduplicate tiples when loading wikibase RDF dataset into hive

2021-07-12 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a user of the wikidata triples database available in hive I want to have all the triples to be unique so that analyses are more accurate

[Wikidata-bugs] [Maniphest] T282790: Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure

2021-06-24 Thread dcausse
dcausse moved this task from Analysis to Current work on the Wikidata-Query-Service board. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T282790 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T277443: The streaming updater consumer should log information when divergences are detected

2021-06-24 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to Needs review on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T277443 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T274990: Better identify the multiple WDQS updaters via User-Agents

2021-06-22 Thread dcausse
dcausse moved this task from Ready for Development to Needs Reporting on the Discovery-Search (Current work) board. dcausse closed this task as "Resolved". dcausse claimed this task. dcausse added a comment. The updater can now be configured with a user_agent (was done as part o

[Wikidata-bugs] [Maniphest] T284137: Allow federated queries with the Lingua Libre SPARQL endpoint

2021-06-22 Thread dcausse
dcausse moved this task from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Deployed and available on https://wcqs-beta.wmflabs.org/ via `SERVICE <https://lingualibre.org/sparql>`, will be available on wdqs after the next
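For readers unfamiliar with SPARQL federation, the `SERVICE` clause mentioned in the comment above can be sketched as follows. This is a minimal illustration, not the query used in the task: the helper names are hypothetical, the triple pattern is deliberately generic and assumes nothing about the Lingua Libre data model, and the SPARQL URL of the querying service is left to the caller because the comment only gives the site root.

```python
from urllib.parse import urlencode

# Federated endpoint named in the comment above.
LINGUALIBRE_ENDPOINT = "https://lingualibre.org/sparql"


def build_federated_query(limit: int = 5) -> str:
    """Return a SPARQL query that delegates a sub-select to a remote endpoint."""
    # The LIMIT sits inside the SERVICE block so the remote endpoint only
    # has to return a handful of rows. The pattern itself is a placeholder.
    lines = [
        "SELECT ?s ?p ?o WHERE {",
        f"  SERVICE <{LINGUALIBRE_ENDPOINT}> {{",
        f"    SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {limit}",
        "  }",
        "}",
    ]
    return "\n".join(lines)


def build_request_url(sparql_url: str, query: str) -> str:
    """Encode the query as a GET request against a caller-supplied SPARQL URL."""
    return f"{sparql_url}?{urlencode({'query': query})}"
```

The sketch only builds strings; sending the request, and whether a given endpoint is on the federation allowlist, depends on the deployment.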

[Wikidata-bugs] [Maniphest] T270245: Jmx metrics for blazegraph are no longer visible in grafana

2021-06-21 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T270245 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T280166: Investigate using session cluster for Flink

2021-06-14 Thread dcausse
dcausse closed this task as "Resolved". dcausse assigned this task to Mstyles. dcausse added a comment. We decided to go with the session cluster for now and evaluate moving to app cluster once we have more experience running flink over k8s TASK DETAIL https://phabricator.wik

[Wikidata-bugs] [Maniphest] T264006: Deploy Flink to kubernetes (k8s)

2021-06-14 Thread dcausse
dcausse closed subtask T280166: Investigate using session cluster for Flink as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T264006 To: dcausse Cc: akosiaris, Mstyles, Gehel

[Wikidata-bugs] [Maniphest] T284137: Allow federated queries with the Lingua Libre SPARQL endpoint

2021-06-11 Thread dcausse
dcausse added a comment. @Seb35 this would make a lot of sense, please update the task description if you do so. TASK DETAIL https://phabricator.wikimedia.org/T284137 To: dcausse Cc: sbassett, Aklapper

[Wikidata-bugs] [Maniphest] T284137: Allow federated queries with the Lingua Libre SPARQL endpoint

2021-06-11 Thread dcausse
dcausse added a subscriber: VIGNERON. dcausse added a comment. @WikiLucas00 I contacted @VIGNERON, whom I had been in contact with in the past about lingualibre, and I just heard back from him. TASK DETAIL https://phabricator.wikimedia.org/T284137 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T275500: Allow host wikisource.org for MediaWiki API queries

2021-06-10 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to Needs review on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T275500 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T279810: Allow host meta.wikimedia.org for MediaWiki API queries

2021-06-10 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to Needs review on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T279810 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-06-10 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. I ran a backfill (reaching directly to appservers) using a thread pool of size 6 over 12 workers (72) and the impact on the
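The "(72)" in the comment above is the product of the two figures it quotes: 6 threads per worker times 12 workers gives at most 72 requests in flight. A minimal sketch of that bound, assuming each worker limits itself with a fixed-size thread pool; the function names and the `fetch` callable are hypothetical and are not taken from the updater's code.

```python
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 6   # threads per worker, as quoted in the comment
WORKERS = 12    # number of workers, as quoted in the comment


def max_in_flight(pool_size: int, workers: int) -> int:
    """Upper bound on simultaneous requests across all workers."""
    return pool_size * workers


def fetch_all(entity_ids, fetch, pool_size: int = POOL_SIZE):
    """Run `fetch` over entity_ids with at most `pool_size` calls at once.

    This models a single worker; results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(fetch, entity_ids))
```

Lowering either number reduces the ceiling proportionally, which is the knob a parallelism limit of this kind would turn.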

[Wikidata-bugs] [Maniphest] T278385: Streaming Updater must make all requests to proxy endpoints

2021-06-09 Thread dcausse
dcausse claimed this task. dcausse moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T278385 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T283663: Deleted lexeme in Wikidata still in triple store (Wikidata Query Service)

2021-06-08 Thread dcausse
dcausse added a comment. I resynced all items present in the deletion log since `2021-03-30T00:00:00Z`, I checked the ones mentioned here and they're gone now. TASK DETAIL https://phabricator.wikimedia.org/T283663 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/
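The comment above does not say how the deletion log was read. One way to list deletions since a timestamp is the MediaWiki Action API's `list=logevents` module, sketched below as an assumption rather than as the procedure actually used. The helper names are hypothetical, and the code only builds the request without sending it.

```python
from urllib.parse import urlencode

# Public Wikidata Action API endpoint.
API_URL = "https://www.wikidata.org/w/api.php"


def deletion_log_params(since: str, limit: int = 500) -> dict:
    """Parameters for listing deletion-log entries from `since` onwards."""
    return {
        "action": "query",
        "list": "logevents",
        "letype": "delete",   # deletion log (this log also records restores)
        "lestart": since,     # timestamp to start enumerating from
        "ledir": "newer",     # oldest first, moving forward in time
        "lelimit": limit,
        "format": "json",
    }


def deletion_log_url(since: str) -> str:
    """Full GET URL for the first page of results."""
    return f"{API_URL}?{urlencode(deletion_log_params(since))}"
```

Called with `"2021-03-30T00:00:00Z"`, this yields the first page of entries; a real resync would also need to follow the API's continuation tokens and filter out restore actions.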

[Wikidata-bugs] [Maniphest] T283663: Deleted lexeme in Wikidata still in triple store (Wikidata Query Service)

2021-06-08 Thread dcausse
dcausse moved this task from Ready for Development to In Progress on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T283663 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T283599: Add support for schema evolution of serialized objects

2021-06-07 Thread dcausse
dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T283599 To: dcausse Cc: Aklapper, dcausse, Invadibot, Lalamarie69, MPhamWMF, maantietaja, A

[Wikidata-bugs] [Maniphest] T283599: Add support for schema evolution of serialized objects

2021-06-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T283599 To: dcausse Cc: Aklapper, dcausse, Invadibot, Lalamarie69, MPhamWMF, maantietaja, Alter-paule, Beast1978, CBogen

[Wikidata-bugs] [Maniphest] T283599: Add support for schema evolution of serialized objects

2021-06-07 Thread dcausse
dcausse claimed this task. dcausse moved this task from Incoming to Needs review on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T283599 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T284040: Enable blank node skolemization on wcqs

2021-06-01 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T284040 To: dcausse Cc: Aklapper, Multichill, dcausse, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T284040: Enable blank node skolemization on wcqs

2021-06-01 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a WCQS user I would like to have blank nodes skolemized so that I can test if my queries/tools/processes are affected by this change. As

[Wikidata-bugs] [Maniphest] T266470: Expose wdqs1009 to wdqs users and gather feedback

2021-06-01 Thread dcausse
dcausse added a comment. @Multichill thanks for the suggestion! Filed T284040 <https://phabricator.wikimedia.org/T284040> to make this happen. TASK DETAIL https://phabricator.wikimedia.org/T266470 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreference

[Wikidata-bugs] [Maniphest] T283599: Add support for schema evolution of serialized objects

2021-05-25 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a maintainer of the wdqs streaming updater I want all the objects serialized in the pipeline to support schema upgrades so that I don't ha

[Wikidata-bugs] [Maniphest] T283591: StateExtractionJob is too slow

2021-05-25 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a maintainer of the wdqs streaming updater I want the StateExtractionJob to run in a reasonable amount of time so that I don't have to dow

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-05-25 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T244590 To: dcausse Cc: MPhamWMF, Daniel_Mietchen, Thadguidry, tfmorris, revi, Ladsgroup, Multichill, darthmon_wmde

[Wikidata-bugs] [Maniphest] T266470: Expose wdqs1009 to wdqs users and gather feedback

2021-05-25 Thread dcausse
dcausse added a comment. @Multichill It was announced here: https://lists.wikimedia.org/hyperkitty/list/wikid...@lists.wikimedia.org/message/LPOHD3J3IX74A6BXV2YYCCFURFTOVDHJ/ Feedback (that I'm aware of) we received so far has been here: - https://www.wikidata.org

[Wikidata-bugs] [Maniphest] T262942: PoC on anomaly detection with Flink

2021-05-17 Thread dcausse
dcausse removed projects: Patch-For-Review, Discovery-Search (Current work). dcausse removed dcausse as the assignee of this task. dcausse added a comment. Made https://github.com/nomoa/flink-python-demo but stopped actively working on this for the moment, have hit issues with python env

[Wikidata-bugs] [Maniphest] T271851: Clean up gui from the wdqs deploy repo and puppet

2021-05-17 Thread dcausse
dcausse moved this task from In Progress to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Seems like everything has been done for this ticket. TASK DETAIL https://phabricator.wikimedia.org/T271851 WORKBOARD https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T199219: WDQS should use internal endpoint to communicate to Wikidata

2021-05-17 Thread dcausse
dcausse moved this task from Needs review to To Be Deployed on the Discovery-Search (Current work) board. dcausse assigned this task to Mstyles. TASK DETAIL https://phabricator.wikimedia.org/T199219 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T278246: Report WDQS update latency when displaying/serving results

2021-05-10 Thread dcausse
dcausse merged a task: T282398: WDQS 'how up to date is the data' feature is misleading, is looking at the wrong thing. dcausse added a subscriber: Tagishsimon. TASK DETAIL https://phabricator.wikimedia.org/T278246 EMAIL PREFERENCES https://phabricator.wikimedia.org/sett

[Wikidata-bugs] [Maniphest] T282398: WDQS 'how up to date is the data' feature is misleading, is looking at the wrong thing

2021-05-10 Thread dcausse
dcausse closed this task as a duplicate of T278246: Report WDQS update latency when displaying/serving results. TASK DETAIL https://phabricator.wikimedia.org/T282398 To: dcausse Cc: Aklapper, Tagishsimon

[Wikidata-bugs] [Maniphest] T282222: SPARQL query for all painting stopped returning results

2021-05-07 Thread dcausse
dcausse added a comment. My bad, the depool command I ran this morning did not work due to an error running the command (I ran `sudo depool wdqs1012` instead of `sudo depool` so it depooled the service `wdqs1012` which is nonexistent). For the record I ran the proper command now

[Wikidata-bugs] [Maniphest] T282103: WDQS unit tests should not rely on external resources

2021-05-06 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a wdqs maintainer I want to run the test suites without internet access so that I can build the project in restricted environments like the

[Wikidata-bugs] [Maniphest] T275068: Get baseline measurements/expectations for splitting lexemes from Wikidata graph

2021-04-21 Thread dcausse
dcausse added a comment. In T275068#7021725 <https://phabricator.wikimedia.org/T275068#7021725>, @MPhamWMF wrote: > Thanks, @dcausse! > Do you know what percentage of total queries 529097 and 357917 are? I hear you on not trusting these numbers, and I think ballparking is

[Wikidata-bugs] [Maniphest] T275068: Get baseline measurements/expectations for splitting lexemes from Wikidata graph

2021-04-20 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. > percentage, number of WDQS queries per month that involve Lexemes > >> percentage, number of the above queries that only involve Lexemes (i.e. doe

[Wikidata-bugs] [Maniphest] T280462: bd:sample is not documented in the WDQS manual

2021-04-19 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. `bd:sample` is a blazegraph feature and should be documented on the blazegraph wiki which is referenced from https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual#Blazegraph_extensions. TASK DETA

[Wikidata-bugs] [Maniphest] T94019: Generate RDF from JSON

2021-04-19 Thread dcausse
dcausse added a comment. Indeed, the RDF data is available in the hive table `discovery.wikibase_rdf`, but it is generated by reading the TTL dumps so it might not help for this particular task. Using Hadoop will indeed allow processing the JSON efficiently but has drawbacks as already

[Wikidata-bugs] [Maniphest] T275068: Get baseline measurements/expectations for splitting lexemes from Wikidata graph

2021-04-19 Thread dcausse
dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T275068 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Lydia_Pintscher, DVrandecic, Lucas_Werkmeister_WMDE, Aklapper, MPhamWMF, Invadibot, maantietaja, CBogen

[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-04-15 Thread dcausse
dcausse added a comment. Restricted Application added a project: wdwb-tech. Since we are going to use envoy to contact MW application servers, I wonder whether this kind of limit could be enforced by it. Today I think that wdqs updaters are talking to the edge caches and some requests might

[Wikidata-bugs] [Maniphest] T273098: High Availability Flink

2021-04-14 Thread dcausse
dcausse added a comment. In T273098#6997661 <https://phabricator.wikimedia.org/T273098#6997661>, @JMeybohm wrote: > I do see that using the configmap election method is appealing as it is built in and does not require additional software to function. Unfortunately I was no

[Wikidata-bugs] [Maniphest] T279698: WDQS should retry when getting 404s

2021-04-08 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T279698 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314

[Wikidata-bugs] [Maniphest] T279698: WDQS should retry when getting 404s

2021-04-08 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T279698 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314

[Wikidata-bugs] [Maniphest] T279698: WDQS should retry when getting 404s

2021-04-08 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a maintainer of the wdqs streaming updater I want requests to Special:EntityData receiving a

[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse added a comment. Another `weird` behavior is that you can expand the 7 results without asking for more: Steps to reproduce: - copy "te" into your paste buffer - enter the search widget - paste and wait for the 7 results to be suggested - delete the searc

[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse added a comment. @Moebeus thanks for the report, do you know if the duplicates appear after clicking `more` to display the remaining results or directly? If they appear directly could you check by scrolling down if all the first 7 results are duplicated? By default only 7 items

[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse edited projects, added Discovery-Search; removed Discovery. TASK DETAIL https://phabricator.wikimedia.org/T279639 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Lea_Lacroix_WMDE, dcausse, Gehel, Aklapper, Moebeus, Invadibot

[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T279541 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-04-07 Thread dcausse
dcausse added a subtask: T279541: Add a reconciliation strategy to the wdqs streaming updater. TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Daniel_Mietchen, Thadguidry

[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T279541 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Invadibot

[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T279541 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314

[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a maintainer of WDQS I want the streaming updater to be able to reconcile a wikibase item so

[Wikidata-bugs] [Maniphest] T270476: Linked Data Fragments endpoint returns IllegalStateException

2021-03-31 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T270476 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T278693: Manually purge obsolete/outdated entites from WDQS (2021-03)

2021-03-30 Thread dcausse
dcausse closed this task as "Resolved". dcausse claimed this task. dcausse added a comment. In T278693#6956808 <https://phabricator.wikimedia.org/T278693#6956808>, @MisterSynergy wrote: > I read the announcement and I am pretty excited about the improvements. The qu
