dcausse added a comment.
Unsure if this is feasible, but perhaps manually flagging a list of safe and
very popular regexes <https://w.wiki/APPB> could help reduce the number of
requests to shellbox?
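A minimal sketch of what such a manual allowlist could look like. Everything
here is hypothetical: the pattern entries are placeholders, `remote_check`
stands in for whatever call reaches shellbox, and full-match semantics are
assumed. Python's `re` is not PCRE, so this only illustrates the control flow.

```python
import re

# Hypothetical allowlist: patterns reviewed by hand and considered safe to
# evaluate in-process. The entries are placeholders, not real data.
SAFE_PATTERNS = frozenset({
    r"\d{4}",
    r"[A-Z]{2}\d{6}",
})


def needs_shellbox(pattern: str) -> bool:
    """Return True when the pattern is not on the manual allowlist."""
    return pattern not in SAFE_PATTERNS


def matches(pattern: str, value: str, remote_check) -> bool:
    """Evaluate allowlisted patterns locally and delegate all others.

    `remote_check` is a hypothetical callable taking (pattern, value) and
    returning a bool; it represents the sandboxed evaluation.
    """
    if needs_shellbox(pattern):
        return remote_check(pattern, value)
    return re.fullmatch(pattern, value) is not None
```

Only patterns missing from the allowlist would trigger a remote request, so
the saving depends on how much of the traffic the flagged patterns cover.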
TASK DETAIL
https://phabricator.wikimedia.org/T214378
dcausse renamed this task from "Request permission to create 4 kafka topics in
kafka-main" to "Request permission to create 4 kafka topics in kafka-main (WDQS
graph split)".
TASK DETAIL
https://phabricator.wikimedia.org/T367510
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service, serviceops,
Data-Platform-SRE.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: wmde-wikidata-tech.
TASK DESCRIPTION
As part of the work to split the WDQS graph we
dcausse added a comment.
I did some testing and sadly when a wdqs node makes a query to
https://query.wikidata.org it hits varnish again:
from wdqs1020 to https://query.wikidata.org (`echo 'SELECT ?test_dcausse {
?test_dcausse ?p ?o . } LIMIT 1' | curl -f -s --data-urlen
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T365692
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: dcausse
Cc: Fnielsen, Lucas_Werkmeister_WMDE, Aklapper, Danny_Benjafield_WMDE,
S8321414, Astuthiodit_1
dcausse moved this task from In Progress to Needs Reporting on the
Discovery-Search (Current work) board.
dcausse added a comment.
Triggered a reindex of all the lexemes using
https://gitlab.wikimedia.org/repos/search-platform/cirrus-rerender, might take
about 3 hours to complete.
dcausse added a comment.
The system should now index lexemes properly.
We still have to reindex all the lexemes to fix the ones created/edited
before the fix was applied.
TASK DETAIL
https://phabricator.wikimedia.org/T365692
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T365692
To: dcausse
Cc: Fnielsen, Lucas_Werkmeister_WMDE, Aklapper, Danny_Benjafield_WMDE,
Isabelladantes1983
dcausse added a comment.
The search fields specific to Lexemes are currently ignored causing this
NOTICE but also preventing lexemes from being searchable (esp. the new ones).
The schemas should be adapted to support these fields and the lexemes will
have to be re-indexed.
dcausse closed this task as a duplicate of T365692: PHP Notice: Undefined
index: lexeme_language / lexical_category.
TASK DETAIL
https://phabricator.wikimedia.org/T365684
To: dcausse
Cc: Aklapper, Fnielsen
dcausse merged a task: T365684: Particular lexeme (L1326823) not indexed so
search with the Wikidata API returns nothing.
TASK DETAIL
https://phabricator.wikimedia.org/T365692
To: dcausse
Cc: Fnielsen
dcausse claimed this task.
dcausse moved this task from Incoming to In Progress on the Discovery-Search
(Current work) board.
TASK DETAIL
https://phabricator.wikimedia.org/T365692
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
To: dcausse
Cc: RKemper, dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE,
S8321414, Astuthiodit_1
dcausse added a comment.
1. Run hdfs-rsync directly from the blazegraph hosts
- requires installing its dependencies
- requires opening holes between blazegraph and the hadoop cluster
2. Schedule hdfs-rsync on a stat machine copying the ttl dumps from hdfs to
`/srv/analytics-search
dcausse added a comment.
Another approach could be to use the `/mnt/hdfs` mountpoint? I have been told
that it might not be stable enough but perhaps it's OK for doing a copy?
TASK DETAIL
https://phabricator.wikimedia.org/T349069
dcausse added a comment.
Looking at the constraints I believe that 4 may use sparql:
- FormatChecker.php
- TypeChecker.php
- UniqueValueChecker.php
- ValueTypeChecker.php
FormatChecker switched to using shellbox so I think it can be ignored.
TypeChecker & ValueTypeChecker
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
The current data-transfer cookbook assumes that a single graph is served
from all wdqs nodes; this will no longer be the case when the graph
dcausse added a comment.
@BTullis @bking I plan to use a cookbook to transfer some data out of hdfs to
blazegraph machines, a naive approach I thought about was to use a temp folder
somewhere in `/srv` of a stat100x machine and then re-use the transferpy
<https://gerrit.wikimedia.or
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
To: dcausse
Cc: RKemper, dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE,
Isabelladantes1983
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T349069
To: dcausse
Cc: Daniel_Mietchen, JAllemandou, dr0ptp4kt, bking, BTullis, dcausse, Aklapper,
Danny_Benjafield_WMDE
dcausse claimed this task.
dcausse moved this task from Incoming to In Progress on the Discovery-Search
(Current work) board.
TASK DETAIL
https://phabricator.wikimedia.org/T349069
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse added a project: Discovery-Search (Current work).
TASK DETAIL
https://phabricator.wikimedia.org/T349069
To: dcausse
Cc: Daniel_Mietchen, JAllemandou, dr0ptp4kt, bking, BTullis, dcausse, Aklapper
dcausse moved this task from Ready for Dev -- SWE to Needs review on the
Discovery-Search (Current work) board.
dcausse claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse claimed this task.
dcausse moved this task from Ready for Dev -- SWE to In Progress on the
Discovery-Search (Current work) board.
TASK DETAIL
https://phabricator.wikimedia.org/T362060
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362977
To: dcausse
Cc: bking, dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1,
AWesterinen
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Reported at
https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Stale_values_in_SparQL_query_result
- Q968274
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
To: dcausse
Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE,
Isabelladantes1983, Themindcoder
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
To: dcausse
Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE,
Isabelladantes1983, Themindcoder
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362508
To: dcausse
Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE,
Isabelladantes1983, Themindcoder
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
The updater is misbehaving in codfw, apparently processing too many
`reconciliations` which triggers a //slow// update mode and thus is not able to
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T361935
To: dcausse
Cc: Daniel_Mietchen, dr0ptp4kt, pfischer, dcausse, Aklapper,
Danny_Benjafield_WMDE, S8321414
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T361935
To: dcausse
Cc: Daniel_Mietchen, dr0ptp4kt, pfischer, dcausse, Aklapper,
Danny_Benjafield_WMDE, S8321414
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
(originally reported at
https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#WDQS_wikibase:around_issue)
It might
dcausse added a subtask: T362060: Generalize ScholarlyArticleSplitter.
TASK DETAIL
https://phabricator.wikimedia.org/T337013
To: dcausse
Cc: Daniel_Mietchen, Kanashimi, SEgt-WMF, dr0ptp4kt, RKemper, bking
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS.
TASK DETAIL
https://phabricator.wikimedia.org/T362060
To: dcausse
Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
The spark job ScholarlyArticleSplitter should be generalized to support the
general case with //n// subgraphs, a wider variety of rules and stubs
dcausse moved this task from Blocked/Waiting to Needs Reporting on the
Discovery-Search (Current work) board.
dcausse added a comment.
Two scholia queries were rewritten:
-
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federated_Queries_Examples
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS.
TASK DETAIL
https://phabricator.wikimedia.org/T361950
To: dcausse
Cc: Aklapper, dcausse, Danny_Benjafield_WMDE, S8321414
dcausse added a subtask: T361950: Ensure that WDQS query throttling does not
interfere with federation.
TASK DETAIL
https://phabricator.wikimedia.org/T337013
To: dcausse
Cc: Kanashimi, SEgt-WMF, dr0ptp4kt
dcausse renamed this task from "Ensure that WDQS query throttling do not
interfere with federation" to "Ensure that WDQS query throttling does not
interfere with federation".
TASK DETAIL
https://phabricator.wikimedia.org/T361950
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
When we exposed the 3 experimental endpoints to test the first version of the
graph split we disabled query throttling to avoid impacting the
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T361935
To: dcausse
Cc: dr0ptp4kt, pfischer, dcausse, Aklapper, AWesterinen, Namenlos314, Gq86,
Lucas_Werkmeister_WMDE
dcausse added a subtask: T361935: Adapt the WDQS Streaming Updater to update
multiple WDQS subgraphs.
TASK DETAIL
https://phabricator.wikimedia.org/T337013
To: dcausse
Cc: Kanashimi, SEgt-WMF, dr0ptp4kt
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS.
TASK DETAIL
https://phabricator.wikimedia.org/T361935
To: dcausse
Cc: dr0ptp4kt, pfischer, dcausse, Aklapper, AWesterinen
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
In order to support updating the subgraphs defined in
Wikidata:SPARQL_query_service/WDQS_graph_split
<https://www.wikidata.org/w
dcausse added a comment.
Thanks! I'm not very familiar with alerts being set from grafana either.
I'll try to get more info on this; worst case we can always set up a new one
directly in alertmanager just for the wdqs lag and sent to the search team
using the same formu
dcausse removed dcausse as the assignee of this task.
dcausse added a comment.
@Lucas_Werkmeister_WMDE thanks! Do you know where we could update this to
include our alert email for such alerts?
TASK DETAIL
https://phabricator.wikimedia.org/T361114
dcausse moved this task from Needs review to Needs Reporting on the
Discovery-Search (Current work) board.
dcausse added a comment.
Should be working properly now
TASK DETAIL
https://phabricator.wikimedia.org/T353683
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse closed this task as "Declined".
dcausse added a comment.
won't be required after all
TASK DETAIL
https://phabricator.wikimedia.org/T361106
To: bking, dcausse
Cc: dcausse,
dcausse closed subtask T361106: Restore wdqs1013 with a data transfer as
"Declined".
TASK DETAIL
https://phabricator.wikimedia.org/T360993
To: dcausse
Cc: bking, Aklapper, dcausse, Danny_Benja
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T361246
To: dcausse
Cc: dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1,
AWesterinen, karapayneWMDE
dcausse moved this task from Backlog to Blocked / Waiting on the
Data-Platform-SRE (2024.03.25 - 2024.04.14) board.
dcausse added a comment.
I restarted the updater on wdqs1013 and it's catching up. I have a note to
check the status tomorrow and will repool it if necessary.
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T361246
To: dcausse
Cc: dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1,
AWesterinen, karapayneWMDE
dcausse added a project: Wikidata-Query-Service.
TASK DETAIL
https://phabricator.wikimedia.org/T361246
To: dcausse
Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE,
EBjune
dcausse added a comment.
I could re-enable puppet on wdqs1013 and restart the updater to catch up on
updates. But apparently this machine was repooled yesterday (as part of the
wdqs scap deploy I suppose) and thus started to serve stale data without
triggering any maxlag. It's wh
dcausse added a comment.
After depooling the node we can see the query rate actually going down to 0.
The request rate is generally very low on codfw, so we might have to tune the
threshold to around 0.2.
F43663858: image.png <https://phabricator.wikimedia.org/F43663858>
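A minimal sketch of the threshold check described above, assuming a host is
treated as pooled only when its observed query rate is strictly above the
threshold. The function name, the strict comparison and the unit of the rate
are assumptions; only the value 0.2 comes from the comment.

```python
# Value suggested in the comment above; the unit of the rate is not stated
# there and is left unspecified here.
POOLED_RATE_THRESHOLD = 0.2


def is_pooled(query_rate: float, threshold: float = POOLED_RATE_THRESHOLD) -> bool:
    """Guess whether a host serves user traffic from its query rate.

    A depooled host is expected to show a rate close to 0, so any rate at or
    below the threshold is treated as "not pooled" (an assumption).
    """
    return query_rate > threshold
```

With a low-traffic datacenter the margin between a pooled and a depooled host
is small, which is why the threshold would need tuning against real rates.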
dcausse removed a project: Patch-For-Review.
TASK DETAIL
https://phabricator.wikimedia.org/T336352
To: hoo, dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, ItamarWMDE, dcausse,
Danny_Benjafield_WMDE
dcausse added a comment.
The approach taken is:
- from nginx control a new header named 'x-monitoring-query' set to true if a
list of criteria is met (currently using user-agent strings but could be
extended to using source IPs as well I suppose)
- from blazegraph, do not
dcausse moved this task from Incoming to Needs review on the Discovery-Search
(Current work) board.
dcausse claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T360993
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse added a comment.
Here are the UAs seen in an hour on a depooled server:
+--+-+
|UA|count
dcausse triaged this task as "High" priority.
dcausse added a project: Discovery-Search (Current work).
TASK DETAIL
https://phabricator.wikimedia.org/T360993
To: dcausse
Cc: Aklapper, dcausse, A
dcausse added a comment.
Mitigation:
- blazegraph stopped
- updater stopped with the `/srv/wdqs/data_loaded` flag removed
- puppet disabled
TASK DETAIL
https://phabricator.wikimedia.org/T360993
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Propagating the lag of a wdqs host should only be done if this host is
''pooled'' (actually serving user traffic).
Deter
dcausse moved this task from In Progress to Needs review on the
Discovery-Search (Current work) board.
dcausse added a comment.
draft page:
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federation_Limits
TASK DETAIL
https://phabricator.wikimedia.org/T357966
dcausse claimed this task.
dcausse moved this task from Ready for Dev -- SWE to In Progress on the
Discovery-Search (Current work) board.
TASK DETAIL
https://phabricator.wikimedia.org/T357966
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse moved this task from In Progress to Needs review on the
Discovery-Search (Current work) board.
dcausse added a comment.
changed the layout of the query a bit by moving the logistic function
introduced in T271799 <https://phabricator.wikimedia.org/T271799> to the
top-level so t
dcausse claimed this task.
dcausse moved this task from In Progress to Needs Reporting on the
Discovery-Search (Current work) board.
dcausse added a comment.
Compiled 10 real-world examples at
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split
dcausse added a comment.
final report available at
https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/WDQS_Graph_Split_Impact_Analysis
TASK DETAIL
https://phabricator.wikimedia.org/T355040
dcausse added a comment.
@Physikerwelt thanks for your feedback.
Blazegraph is definitely not the best solution and the work to move off of
blazegraph should be tracked under https://phabricator.wikimedia.org/T330525
(see the initial exploration
<https://www.wikidata.org/w
dcausse added a comment.
In T356773#9531179 <https://phabricator.wikimedia.org/T356773#9531179>,
@EgonWillighagen wrote:
> I tried to get the federation working, but got time outs too. The problem
is that the current setup makes splits at a statement level. That is, given
s
dcausse moved this task from To Be Deployed to In Progress on the
Discovery-Search (Current work) board.
dcausse added a comment.
The new builder moved the result to #4 which is better but still not enough
and it's beaten by 3 other images because other cri
dcausse added a comment.
WIP at
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federated_Queries_Examples
TASK DETAIL
https://phabricator.wikimedia.org/T357980
To: dcausse
dcausse added a subtask: T357980: Compile a set of queries rewritten with
federation across the two graph splits.
TASK DETAIL
https://phabricator.wikimedia.org/T337013
To: dcausse
Cc: SEgt-WMF, dr0ptp4kt
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS.
TASK DETAIL
https://phabricator.wikimedia.org/T357980
To: dcausse
Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86
dcausse renamed this task from "Compile a set of queries rewritten with
federation accross the two graph splits" to "Compile a set of queries rewritten
with federation across the two graph splits".
TASK DETAIL
https://phabricator.wikimedia.org/T357980
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Having a set of examples might be helpful for users experimenting with the
graph split.
A subpage under
https://www.wikidata.org/wiki
dcausse added a subtask: T357966: Document limitations of blazegraph federation.
TASK DETAIL
https://phabricator.wikimedia.org/T337013
To: dcausse
Cc: SEgt-WMF, dr0ptp4kt, RKemper, bking, tfmorris, elal
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS.
TASK DETAIL
https://phabricator.wikimedia.org/T357966
To: dcausse
Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Writing a query that federates multiple SPARQL endpoints can be challenging
if the intermediate results that have to be shared are big.
Better
dcausse moved this task from In Progress to Needs review on the
Discovery-Search (Current work) board.
dcausse added a comment.
Draft report up at
https://wikitech.wikimedia.org/wiki/User:DCausse/WDQS_Graph_Split_Impact_Analysis
TASK DETAIL
https://phabricator.wikimedia.org/T355040
dcausse added a comment.
In T353453#9524925 <https://phabricator.wikimedia.org/T353453#9524925>,
@AndrewTavis_WMDE wrote:
> Quick note on this:
>
> There are two ways that need to be factored in to deriving if a query is
from Scholia. Some queries do start with `#to
dcausse added a comment.
WIP:
- included the new 100k queries sample named `QUERY-Q4` from T349512
<https://phabricator.wikimedia.org/T349512> (random sample that is
representative of the query length and runtime)
- the % of affected queries (deduplicated) per tool is (//
dcausse added a comment.
@dr0ptp4kt thanks! is the difference in the number of successful queries only
explained by the improvement in query time or are there some improvements in
the number of queries that timeout as well?
TASK DETAIL
https://phabricator.wikimedia.org/T355037
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T355888
To: dcausse
Cc: RKemper, dcausse, Aklapper, Danny_Benjafield_WMDE, Isabelladantes1983,
Themindcoder, Adamm71
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Failure seen while
`org.wikidata.query.rdf.spark.transform.queries.sparql.QueryExtractor` was
processing the dataset
dcausse added a comment.
Scanning dumps from 2024/01/21 we can find 1623 duplicated statement ids
(full list here:
https://people.wikimedia.org/~dcausse/T356161_sdc_duplicated_statement_ids.csv)
TASK DETAIL
https://phabricator.wikimedia.org/T356161
dcausse renamed this task from "WikibaseMediaInfo (or Wikibase?) seems to reuse
statement identifiers from other entities" to "WikibaseMediaInfo seems to reuse
statement identifiers from other entities".
dcausse updated the task description.
TASK DETAIL
https://phabr
dcausse added a comment.
@Lucas_Werkmeister_WMDE thanks for all the context! I get that it only
affects WikibaseMediaInfo. Can we exclude Wikibase as a culprit possibly
affecting wikidata or should we run a quick investigation to find possible
duplicated statement identifiers in the
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356161
To: dcausse
Cc: Lucas_Werkmeister_WMDE, dcausse, Aklapper, Danny_Benjafield_WMDE,
Astuthiodit_1, AWesterinen
dcausse created this task.
dcausse added projects: WikibaseMediaInfo, Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Structured-Data-Backlog.
TASK DESCRIPTION
Seen on M130887689
<https://commons.wikimedia.org/w
dcausse added a comment.
WIP:
https://people.wikimedia.org/~dcausse/T355040_EARLY_DRAFT_wdqs_query_results_analysis.html
(UA redacted for now)
TL/DR:
- added support for identifying true positives (queries with a scientific
article in the sparql query or in the results
dcausse added a subtask: T355888: Enable cross federation between experimental
WDQS endpoints.
TASK DETAIL
https://phabricator.wikimedia.org/T351650
To: RKemper, dcausse
Cc: Gehel, bking, dcausse
dcausse added a parent task: T351650: Expose 3 new dedicated WDQS endpoints.
TASK DETAIL
https://phabricator.wikimedia.org/T355888
To: dcausse
Cc: RKemper, dcausse, Aklapper, AWesterinen, BTullis
dcausse created this task.
dcausse added projects: Data-Platform-SRE, Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Experimental endpoints `query-main-experimental` and
`query-scholarly-experimental` must allow cross federation.
A simple way to
dcausse added a comment.
Quick report on the progress being made:
- Our query logs do not only contain sparql queries and the sparql client
used to collect the data has to be adapted to support these (ASK, CONSTRUCT,
DESCRIBE) (https://gerrit.wikimedia.org/r/c/wikidata/query/rdf
dcausse claimed this task.
dcausse moved this task from Ready for Dev -- SWE to In Progress on the
Discovery-Search (Current work) board.
TASK DETAIL
https://phabricator.wikimedia.org/T353683
WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service.
TASK DESCRIPTION
By using a tool to compare the differences of two results of the same sparql
query we should evaluate how many queries might "break" when running against
the wikidata main graph
dcausse added a subtask: T355037: Compare the performance of sparql queries
between the full graph and the subgraphs.
TASK DETAIL
https://phabricator.wikimedia.org/T352538
To: dcausse
Cc: Aklapper, Gehel
dcausse added a parent task: T352538: [EPIC] Evaluate the impact of the graph
split.
TASK DETAIL
https://phabricator.wikimedia.org/T355037
To: dcausse
Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86
dcausse renamed this task from "Com" to "Compare the performance of sparql
queries between the full graph and the subgraphs".
dcausse added a project: Wikidata-Query-Service.
dcausse updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T355037